I explain Splunk to my team using a layered approach.
- 1. Full-Text Search… of the world
Start with a full-text search engine, but put _everything_ we have into it:
- logs (host and application)
- configuration files (change management)
- stats (system performance, monitoring)
- alerts (snmp, emails, anything automated)
- reports
- emails
- bugs, tickets, issue and tracker systems
- documentation, wikis, and maybe even source code
This is immediately useful in small ways, and is already common practice at large companies, at least for documentation, wikis, and tickets: when working a problem ticket or fixing a bug, you can search for every detail related to a host.
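Getting data in at this layer is mostly a matter of pointing Splunk at files and directories. A minimal sketch of an inputs.conf, where the paths, index names, and sourcetypes are made-up examples:

```
# inputs.conf -- hypothetical paths and index names, for illustration only
[monitor:///var/log]
index = os_logs
sourcetype = syslog

[monitor:///opt/myapp/logs/*.log]
index = app_logs

# configuration files too, so changes show up as searchable events
[monitor:///etc]
index = config_tracking
```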
- 2. Records
Give that full-text search engine knowledge of the 'record' structure within all those files. When you search, it can then return just the relevant records from a file, but it can also display records in context, or the entire file if desired.
This also lets you restrict search by records, or by transactions made of groups of records, making more powerful searches possible.
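As a sketch of what that record awareness looks like in practice, props.conf is where you tell Splunk where one record ends and the next begins, so that, say, a multi-line stack trace stays a single event; the sourcetype name and regex here are hypothetical:

```
# props.conf -- event (record) breaking for a hypothetical application log
[myapp_log]
SHOULD_LINEMERGE = true
# a new record starts at a leading ISO date, so stack traces stay attached
BREAK_ONLY_BEFORE = ^\d{4}-\d{2}-\d{2}
```

At search time, the `transaction` command can then group related records (for example, everything sharing a session id within a 30-minute window) into a single searchable unit.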
- 3. Fields
Make it cognizant of the ‘fields’ present in those records. This is obviously useful for narrowing queries, as well as for sorting and displaying only the relevant information. But these fields are flexible, and resolved in a lazy fashion: they can differ per file, per record, and per search. Furthermore, they don’t require the heavy project overhead of schema design and data-mining planning. The lazy resolution of fields becomes a powerful tool when combined with step #4.
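For example, a field can be pulled out of the raw record with an inline regex at search time, with no schema declared anywhere up front; the sourcetype and field names below are invented for the illustration:

```
sourcetype=myapp_log ERROR
| rex field=_raw "took (?<elapsed_ms>\d+) ms"
| where tonumber(elapsed_ms) > 500
| table _time host elapsed_ms
```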
- 4. Powerful Expression language
Build on the structure of records and fields, and on the lazy resolution of that structure, to allow all sorts of complicated processing and data manipulation: splitting, mutating, and joining datasets. Not just searching through records, but data munging that generates new data, on which you can then perform further searches.
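A small sketch of that kind of pipeline, assuming a web access sourcetype and made-up field names: derive a new field with `eval`, aggregate with `stats`, and sort the result.

```
sourcetype=access_combined
| eval latency_class=if(response_ms > 500, "slow", "fast")
| stats count AS requests, avg(response_ms) AS avg_ms BY host, latency_class
| sort - avg_ms
```

Generated results can also be written back, for instance with `collect` (summary indexing) or `outputlookup`, and then searched again like any other data.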
- 5. Powerful User Interface
- a CLI with autocompletion
- intuitive record/field browsing
- automatically populated one-click drilldowns
- graphing/visualisation
- default dashboards automatically populated with commonly used searches and keywords
- custom dashboards (see the sketch after this list)
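The dashboards themselves are essentially saved searches plus layout. A minimal Simple XML sketch, where the panel title and query are hypothetical and the exact schema varies a little between Splunk versions:

```
<dashboard>
  <label>Application overview (example)</label>
  <row>
    <panel>
      <title>Errors by host, last 24 hours</title>
      <chart>
        <search>
          <query>sourcetype=myapp_log ERROR | timechart count BY host</query>
          <earliest>-24h</earliest>
          <latest>now</latest>
        </search>
      </chart>
    </panel>
  </row>
</dashboard>
```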
- 6. Zeroconf
Splunk detects most of the details and configuration itself, with heavy heuristics that do the right thing most of the time, but which can be overridden in the cases where they don’t, or just for further control.
The only place you need some forethought is capacity planning.
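When the heuristics guess wrong, the overrides live in the same small config files. For instance (hypothetical sourcetype and timestamp format), pinning a sourcetype and its timestamp parsing rather than letting Splunk autodetect them:

```
# inputs.conf -- stop relying on sourcetype autodetection for this input
[monitor:///opt/myapp/logs/*.log]
sourcetype = myapp_log

# props.conf -- override the guessed timestamp handling
[myapp_log]
TIME_PREFIX = ^\[
TIME_FORMAT = %Y-%m-%d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 25
```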
- 7. Scalability
And finally, wrap this all up in a componentized architecture so that it scales well and you can scale just the components you need to, whether for capacity or for performance.
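In deployment terms that usually means lightweight forwarders on the hosts, a pool of indexers, and one or more search heads, each scaled independently. A sketch of the forwarder side, with placeholder hostnames:

```
# outputs.conf on a universal forwarder -- send events to an indexer pool
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = indexer1.example.com:9997, indexer2.example.com:9997
```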
Eventually we found that a lot of the data we had been generating and feeding into Splunk could be generated more conveniently by Splunk itself, which also let us replace a lot of homebrew code with more robust, flexible, and easier-to-maintain Splunk applications.
I'm the dev lead on the EM4J project that was spoken about at VMWorld. You're right to surmise that the JVM has a problem with regular ballooning. At its simplest, the problem is that the JVM doesn't give memory back to the operating system and thus always ends up consuming its high-watermark memory, even if there's plenty of free space in the heap. Since the balloon driver can only reclaim memory from the OS, these two models are basically incompatible. The result is that it's fiendishly difficult to predict when regular ballooning is safe, and the consequences of getting it wrong are severe (due to GC characteristics, as you state). Hence the current best practice of using memory reservations.
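To make the high-watermark point concrete, here's a small standalone illustration (not EM4J code, just the standard JMX memory bean): after a burst of allocation and a GC, used heap drops, but committed heap, which is what the OS and therefore the balloon driver sees, typically stays near its peak with default collector settings.

```
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapWatermark {
    public static void main(String[] args) {
        // Push the heap up to a high watermark with short-lived garbage.
        for (int i = 0; i < 1000; i++) {
            byte[] chunk = new byte[1024 * 1024]; // 1 MB, dead immediately
        }
        System.gc();

        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // "used" falls after GC, but "committed" (the memory the JVM has actually
        // taken from the OS) usually stays near the peak -- and committed memory
        // is all the hypervisor's balloon driver can see from outside the JVM.
        System.out.printf("used=%d MB, committed=%d MB, max=%d MB%n",
                heap.getUsed() >> 20, heap.getCommitted() >> 20, heap.getMax() >> 20);
    }
}
```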
I'm putting together a series of YouTube clips to explain this whole area in more detail. Hopefully these will be helpful. First one is up now: http://www.youtube.com/watch?v=kyz7J-FQUSM