How to rename an index in Elasticsearch

I’ve found that, on startup, Elasticsearch derives index names from the directory names on disk, which is nice.

This is useful if, for example, you want to change the logstash index mapping template and don’t want to lose all the data indexed so far, go through a lengthy reindex process, or wait until midnight for the index to roll over.

So, this actually works:

  • configure the new index template in logstash
  • shut down the cluster
  • rename today’s logstash index directory to logstash-2015.03.03_beforenoon
  • restart the cluster; elasticsearch figures out that the directory logstash-2015.03.03_beforenoon should be opened as an index named logstash-2015.03.03_beforenoon, and logstash will notice the missing index for today and recreate it with the new template

Nice & almost what I want but I was wondering if I can do the same without shutting down my cluster and restarting it, which kind of a disruptive thing to do in most real environments. After a bit of experimenting, I found that the following works:

PUT /_cluster/settings
{
    "transient" : {
        "discovery.zen.minimum_master_nodes" : 1
    }
}

The actual setting doesn’t matter; as long as you PUT something to the cluster settings, elasticsearch will basically reload the cluster state and pick up the renamed index directory.

Update: you may not want to do this on an index that is being updated (typically an active logstash index), since this duplicates the lock files that elasticsearch uses. I ended up removing these lock files from my index copy, after which it stopped barfing errors about the duplicated lock files. Probably not nice, though. So it’s probably better to:

  • mv logstash-2015.03.03 logstash-2015.03.03_moved
  • clear out any write.lock files inside the new logstash-2015.03.03_moved dir
  • do the PUT to /_cluster/settings (see the sketch below)
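
Putting those steps together, here’s a minimal shell sketch. The data path is an assumption (it depends on your install and cluster name), so adjust it before trying this.

# adjust the data path for your install and cluster name
cd /var/lib/elasticsearch/mycluster/nodes/0/indices
mv logstash-2015.03.03 logstash-2015.03.03_moved
# clear out the lucene write locks in the moved copy
find logstash-2015.03.03_moved -name write.lock -delete
# any PUT to the cluster settings makes elasticsearch pick up the renamed directory
curl -XPUT localhost:9200/_cluster/settings -d '{
    "transient" : {
        "discovery.zen.minimum_master_nodes" : 1
    }
}'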

Elasticsearch failed shard recovery

We have a single node test server with some useful data in there. After an unplanned reboot of the server, elasticsearch failed to recover one shard in our cluster and as a consequence the cluster went red, which means it doesn’t work until you fix it. Kind of not nice. If this were production, I’d be planning an extensive post mortem (how did it happen) and probably doing some kind of restore from a backup. However, this was a test environment, which meant an opportunity to figure out whether the problem could actually be fixed somehow.

I spent nearly two hours figuring out how to recover from this in a way that does not involve going “ahhh whatever” and deleting the index in question. Been there, done that. I suspect I’m not the only one to get stuck in the maze of half truths, well intentioned but incorrect advice, etc. So, I decided to document the fix I pieced together, since I have a hunch this won’t be the last time I have to do this.

This is one topic where the elasticsearch documentation is of little help. It vaguely suggests that this shouldn’t happen and that red is a bad color to see in your cluster status. It also provides you with plenty of ways to figure out that, yes, your cluster isn’t working and why, in excruciating levels of detail. However, very few ways of actually recovering beyond a simple delete and restore from backup are documented.

However, you can actually fix things sometimes and I was able to piece together something that works with a few hours of googling.

Step 0 – diagnose the problem

This mainly involves figuring out which shard(s) are the problem. So:

# check cluster status
curl localhost:9200/_cluster/health
# figure out which indices are in trouble
curl 'localhost:9200/_cluster/health?level=indices&pretty'
# figure out what shard is the problem
curl localhost:9200/_cat/shards

I can never remember these curl incantations so nice to have them in one place. Also, poke around in the log. Look for any errors when elasticsearch restarts.

In my case, the log was pretty clear about the fact that it couldn’t start shard 2 in my inbot_activities_v29 index due to some obscure exception involving “type not found [0]”. I vaguely recall from a previous episode, where I unceremoniously deleted the index and moved on with my life, that the problem is probably related to some index format change between elasticsearch updates some time ago. It doesn’t really matter: we know that somehow that shard is not happy.

Diagnosis: Elasticsearch is not starting because there is some kind of corruption with shard 2 in index inbot_activities_v29. Because of that the whole cluster is marked as red and nothing works. This is annoying and I want this problem to go away fast.

Btw. I also tried the _recovery API, but it seems to lack an option to actually recover anything. Also, it seems not to list any information for the shards that failed to recover. In my case it listed the four other shards in the index, which were indeed fine.
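
For reference, this is the incantation I mean (using the index name from this example); it only reports recovery status and doesn’t fix anything:

# show recovery status for the shards of an index
curl 'localhost:9200/inbot_activities_v29/_recovery?pretty'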

Step 1 – org.apache.lucene.index.CheckIndex to the rescue

We diagnosed the problem. Red index. Corrupted shard. No backups. Now what?

Ok, technically you are looking at data loss at this point. The question is how much data you are going to lose. Your last resort is deleting the affected index. Not great, but it at least gets the rest of the cluster green.
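
For completeness, that last resort is a one-liner (shown here with the index name from this example):

# last resort: delete the broken index, losing whatever data it held
curl -XDELETE 'localhost:9200/inbot_activities_v29'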

But what if you don’t actually care about the one or two documents in the index that are blocking the shard from loading? Is there a way to recover the shard and nurse the broken cluster back to a working state, minus those apparently corrupted documents? That would be preferable to simply deleting the whole index.

The answer is yes. Lucene comes with a tool to fix corrupted indices. It’s not well integrated into elasticsearch; there’s an open ticket in elasticsearch that may eventually address this. In any case, you can run the tool manually.

Assuming a CentOS based RPM install:

# OK last warning: you will probably lose data. Don't do this if you can't risk that.

# this is where the rpm dumped all the lucene jars
cd /usr/share/elasticsearch/lib

# run the tool. You may want to adapt the shard path 
java -cp lucene-core*.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /opt/elasticsearch-data/linko_elasticsearch/nodes/0/indices/inbot_activities_v29/2/index/ -fix

The tool displays some warnings about what it is about to do and, if you are lucky, reports that it fixed some issues and wrote a new segments file. Run the tool again and it reports that everything is fine. Excellent.

Step 2 – Convincing elasticsearch everything is fine

Except, elasticsearch is still red. Restarting it doesn’t help; it stays red. This one took me a bit longer to figure out. It turns out that all those well intentioned blog posts that mention the lucene CheckIndex tool sort of leave the rest of the process as an exercise for the reader. There’s a bit more to it:

# go to wherever the translog of your problem shard is
cd /opt/elasticsearch-data/linko_elasticsearch/nodes/0/indices/inbot_activities_v29/2/translog
ls
# note the recovery file; now would be a good time to make a backup of this file because we will remove it
sudo service elasticsearch stop
rm *recovery
sudo service elasticsearch start

After this, elasticsearch came back green for me (see step 0 for checking that). I lost a single document in the process. Very acceptable given the alternative of having to delete the entire index.

Enforcing code conventions in Java

After many years of working with Java, I finally got around to enforcing code conventions in our project. The problem with code conventions is not agreeing on them (actually, that is hard too, since everybody seems to have their own preferences, but that’s beside the point) but enforcing them. For this purpose you can choose from a wide variety of code checkers such as checkstyle, pmd, and others. My problem with this approach is that checkers usually end up being some combination of too strict, too verbose, and too annoying. In any case nobody ever checks their output, and you need the discipline to fix any detected issues yourself. On most projects I’ve tried checkstyle on, the out of the box configuration finds thousands of trivial issues. Pretty much every Java project I’ve ever been involved with had somewhat vague guidelines on code conventions and a very loose attitude to enforcing them. So, you end up with loads of variation in whitespace, bracket placement, etc. Eventually people stop caring. It’s not a problem worthy of a lot of brain cycles and we are all busy.

Anyway, I finally found a solution to this problem that is completely unintrusive: format the source code as part of your build. Simply add the following blurb to your maven build section and save some formatter settings in XML format in your source tree. It won’t fix all your issues, but formatting related diffs should be a thing of the past. Either your code is fine, in which case it will pass the formatter unmodified, or you messed up, in which case the formatter will fix it for you.

<plugin><!-- mvn java-formatter:format -->
    <groupId>com.googlecode.maven-java-formatter-plugin</groupId>
    <artifactId>maven-java-formatter-plugin</artifactId>
    <version>0.4</version>
    <configuration>
        <configFile>${project.basedir}/inbot-formatter.xml</configFile>
    </configuration>

    <executions>
        <execution>
            <goals>
                <goal>format</goal>
            </goals>
        </execution>
    </executions>
</plugin>

This plugin formats the code using the specified formatter settings XML file, and it executes on every build before compilation. You can create the settings file by exporting the Eclipse code formatter settings. Intellij users can use these settings as well, since recent versions support the Eclipse formatter settings file format. The only thing you need to take care of is the organize imports settings in both IDEs. Eclipse comes with a default configuration that is very different from what Intellij does, and it is a bit of a pain to fix on the Intellij side. Eclipse has a notion of import groups that are each sorted alphabetically. It comes with four of these groups that represent imports with different prefixes, so javax.*, java.*, etc. end up in different groups. This behavior is tedious to emulate in Intellij and out of scope of the exported formatter settings. For that reason, you may want to modify things on the Eclipse side instead: remove all groups and simply sort all imports alphabetically. That behavior is easy to emulate in Intellij, and you can configure both IDEs to organize imports on save, which is good practice. Also, make sure to not allow .* imports and only import what you actually use (why load classes you don’t need?). If everybody does this, the only people causing problems will be those with poorly configured IDEs, and their code will get fixed automatically over time.
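
To illustrate the convention (the class names below are arbitrary examples, not from our code base):

// preferred: explicit imports in a single group, sorted alphabetically
import java.util.ArrayList;
import java.util.List;
import javax.servlet.http.HttpServletRequest;
import org.apache.commons.lang3.StringUtils;

// avoid wildcard imports like this:
// import java.util.*;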

Anyone doing a mvn clean install to build the project will automatically fix any formatting issues that they or others introduced. Also, the formatter can be configured conservatively and if you set it up right, it won’t mess up things like manually added new lines and other manual formatting that you typically want to keep. But it will fix the small issues like using the right number of spaces (or tabs, depending on your preferences), having whitespace around brackets, braces, etc. The best part: it only adds about 1 second to your build time. So, you can set this up and it basically just works in a way that is completely unintrusive.

Compliance problems introduced by people with poor IDE configuration skills/a relaxed attitude to code conventions (you know who you are) will automatically get fixed this way. Win win. There’s always the odd developer out there who insists on using vi, emacs, notepad, or something similarly archaic that most IDE users would consider cruel and unusual punishment. Not a problem anymore, let them. These masochists will notice that whatever they think is correctly formatted Java might cause the build to create a few diffs on their edits. Ideally, this happens before they commit. And if not, you can yell at them for committing untested code: no excuses for not building your project before a commit.

Accessing Elasticsearch clusters via a localhost node

I’m a regular at the Elasticsearch meetup here in Berlin and there are always lots of recent converts that are trying to wrap their head around the ins and outs of what it means to run an elasticsearch cluster. One issue that seems to baffle a lot of new users is the question of which node in the cluster has the master role. The correct answer is that it depends on what you mean by master. Yes, there is a master node in elasticsearch but that does not mean what you think it means: it merely means that a single node is elected to be the node that holds the truth about which nodes have which data and crucially where the master copies of shards live. What it does NOT mean is that that node has the master copy of all the data in the cluster. It also does NOT mean that you have to talk to specifically this node when writing data. Data in elasticsearch is sharded and replicated and shards and their replicas are copied all over the cluster and out of the box clients can talk to any of the nodes for both read and write traffic. You can literally put a load balancer in front of your cluster and round robin all the requests across all the nodes.

When nodes go down or are added to the cluster, shards may be moved around. All nodes synchronize information about which nodes have which shards and replicas of those shards. The elasticsearch master merely is the ultimate authority on this information. Elasticsearch masters are elected at runtime by the nodes in the cluster. So, by default, any of the nodes in the cluster can become elected as the master. By default, all nodes know how to look up information about which shards live where and know how to route requests around in the cluster.

A common pattern in larger clusters is to reserve the master role for nodes that do not have any data; you can specialize what nodes do via configuration. Having three or more such nodes means that if one of them goes down, the remaining ones can elect a new master and the rest of the cluster can just continue spinning. Having an odd number of nodes is a good thing when you are holding elections since you always have an obvious majority of n/2 + 1. With an even number you can end up with two equally sized network partitions.
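
For example, a dedicated master node in a setup with three such nodes could be configured along these lines in elasticsearch.yml (1.x era settings; the values are just an illustration):

# master eligible, but holds no data
node.master: true
node.data: false
# with three master eligible nodes, require a majority of two for an election
discovery.zen.minimum_master_nodes: 2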

The advantage of not having data on a node is that it is far less likely for such nodes to get into trouble with e.g. OutOfMemoryExceptions, excessive IO that slows the machine down, or excessive CPU usage due to expensive queries. If that happens, the availability of the node becomes an issue and the risk of bad things happening increases. That is a bad thing on a node that is supposed to hold the master data for your cluster configuration: it becoming unavailable will cause the other nodes to elect a new master. There’s a fine line between being unavailable and being slow to respond, which makes this a particularly hard problem. The now infamous “call me maybe” article highlights several different cluster failure scenarios, and most of these involve some sort of network partitioning due to temporary master node failures or unavailability. If you are worried about this, also be sure to read the Elasticsearch response to this article. The bottom line is that most of the issues have by now been addressed and are now far less likely to become a problem. Also, if you have declined to update to Elasticsearch 1.4.x in your production setup, now might be a good time to read up on the many known ways in which things can go bad for you.

In any case, such data-less nodes still do useful work. They can for example be used to serve traffic to elasticsearch clients. Most things that happen in Elasticsearch involve internal node communication, since the data can be anywhere in the cluster. So, there are typically two or more network hops involved: one from the client to what is usually called a routing node, and one from there to any other nodes that hold the shards needed to complete the request and that perform the logic for either writing new data to the shard or retrieving data from it.

Another common pattern in the Elasticsearch world is to implement clients in Java and embed a cluster node inside the process. This embedded node is typically configured to be a routing only node. The big advantage of this is that it saves you a network hop: the embedded node already knows where all the shards live, so the application server can talk directly to the nodes that hold these shards, using the more efficient protocol that the Elasticsearch nodes use to communicate with each other.
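
A minimal sketch of that embedded node pattern, using the 1.x era Java API (the cluster name is just a placeholder):

import org.elasticsearch.client.Client;
import org.elasticsearch.node.Node;
import static org.elasticsearch.node.NodeBuilder.nodeBuilder;

// start an embedded node that holds no data and is not master eligible
Node node = nodeBuilder()
    .clusterName("my-cluster") // must match the name of your cluster
    .client(true)              // client mode: routing only, no data
    .node();
Client client = node.client();
// ... use client for index and search requests; call node.close() on shutdown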

A few months ago at one of the meetups I was discussing this topic with one of the organizers, Felix Gilcher. He mentioned an interesting variant of this pattern. Embedding a node inside an application only works for Java applications, and dealing with the Elasticsearch internal API can be quite a challenge as well. So it would be convenient if non Java applications could get similar benefits. He suggested the obvious solution: you get most of the benefits of embedding a node by simply running a standalone, routing only elasticsearch node on each application server. The advantage of this approach is that each of the application servers communicates with elasticsearch via localhost, which is a lot faster than sending REST requests over the network. You still have a bit of overhead related to serializing and deserializing the REST requests. However, all of that happens on localhost and you avoid the network hop. So, effectively, you get most of the benefits of the embedded node approach.
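
Such a localhost node could be configured along these lines in elasticsearch.yml (again 1.x era settings; the cluster name is a placeholder):

# routing only: not master eligible and holds no data
node.master: false
node.data: false
# join the same cluster as the data nodes
cluster.name: my-cluster
# serve REST traffic to the applications on this machine
http.enabled: true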

We recently implemented this at Inbot. We now have a cluster of three elasticsearch nodes and two application servers that each run two additional nodes that talk to the three cluster nodes. We use a mix of Java, Javascript and Ruby components on our servers, and doing this allows us to keep things simple. The elasticsearch nodes on the application servers have a comparatively small heap of only 1GB and typically consume few resources. We could probably reduce the heap size further to 512MB or even 256MB, since all these nodes do is pass requests and data between the cluster and the application server. However, we have plenty of memory and have so far had little need to tune this. Meanwhile, our elasticsearch cluster nodes run on three fast 32GB machines, where we allocate half of the memory for the heap and reserve the rest for file caching (as per the Elasticsearch recommendations). This works great and it also simplifies application configuration, since you can simply configure all applications to talk to localhost and elasticsearch takes care of the cluster management.

Eventual Consistency Now! using Elasticsearch and Redis

Elasticsearch promises real-time search and nearly delivers on this promise. The problem with ‘nearly’ is that in interactive systems it is actually unacceptable for user changes not to be reflected in query results. Eventual consistency is nice, but it also means occasionally being inconsistent, which is not so nice for users, or worse, product managers, who typically don’t understand these things and report them as bugs. At Inbot, this aspect of using Elasticsearch has been keeping us busy. It would be awfully convenient if it never returned stale data.

Mostly things actually work fine, but when a user updates something and then within a second navigates back to a list of stuff that includes what he/she just updated, chances are that it still shows the old version because elasticsearch has not yet committed the change to the index. In any interactive system this is going to be an issue, and one way or another a solution is needed. The reality is that elasticsearch is an eventually consistent cluster when it comes to search, not a proper transactional store that is immediately consistent after modifications. And while it is reasonably good at catching up within a second, that leaves plenty of room for inconsistencies to surface. While you can immediately get any changed document by id, it takes a bit of time for search results to get updated as well. Out of the box, the commit frequency is once every second, which is enough time for a user to click something and then something else and see results that are inconsistent with the actions he/she just performed.
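
For reference, that once-per-second commit is elasticsearch’s index refresh; a quick way to see it (the index name is just a placeholder) is:

# the default index.refresh_interval is 1s; search only sees changes after a refresh
curl 'localhost:9200/myindex/_settings?pretty'
# forcing a refresh makes changes searchable immediately, but doing that on every write gets expensive
curl -XPOST 'localhost:9200/myindex/_refresh'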

We started addressing this with a few client side hacks, like simply replacing list results with what we just edited via the API, updating local caches, etc. Writing such code is error prone and tedious. So we came up with a better solution: use Redis. The same DAO I described in my recent article on optimistic locking with elasticsearch also stores the ids of any modified documents in a short-lived data structure in Redis. Redis provides in-memory data structures such as lists, sets, and hash maps and comes with a ton of options. The nice thing about Redis is that it scales quite well for small things and has a very low latency API. So, it is quite cheap to use it for things like caching.

So, our idea was very simple: use Redis to keep track of recently changed documents and change any results that include these objects on the fly with the latest version of the object. The bit of Java code that we use to talk to Redis uses something called JedisPool. However, this should pretty much work in a similar way from other languages.

try(Jedis jedis = jedisPool.getResource()) {
  Transaction transaction = jedis.multi();
  transaction.lpush(key, value);
  transaction.ltrim(key, 0, capacity); // right away trim to capacity so that we can pretend it is a circular list
  transaction.expire(key, expireInSeconds); // don't keep the data forever
  transaction.exec();
}

This creates a circular list with a fixed length that expires after a few seconds. We use it to store the ids of any documents we modify for a particular index or belonging to a particular user. Using this, we can easily find out, when returning results from our API, whether we should replace some of the results with newer versions. Having the list expire after a few seconds means that elasticsearch has enough time to catch up, and the list will either stay short or not be there at all. Under continuous load, it will simply be trimmed to the latest ids that were added (capacity). So, it stays fast as well.

Each of our DAOs exposes an additional function that tells you which document ids have been recently modified. When returning results, we loop over the results, check each id against this list, and swap in the latest version where needed. Simple and easy to implement, and it solves most of the problem. More importantly, it solves it on the server and does not burden our API users or clients with this.
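
For illustration, the read side could look something like this (SearchResult, results, and dao.getById are made-up names, not our actual API):

try(Jedis jedis = jedisPool.getResource()) {
  // the ids pushed by the DAO on every modification (same key as above)
  List<String> recentlyChangedIds = jedis.lrange(key, 0, -1);
  for(SearchResult result : results) {
    if(recentlyChangedIds.contains(result.getId())) {
      // a get by id is immediately consistent, so swap in the latest version
      result.setSource(dao.getById(result.getId()));
    }
  }
}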

However, it doesn’t fix the problem completely. Your query may match the old document but not the new one, and replacing the old document with the new one in the results will make it appear that the changed document still matches the query. But it is a lot better than showing stale data to the user. Also, we’re not handling deletes currently, but that is trivially supported with a similar solution.

Optimistic locking for updates in Elasticsearch

In a post in 2012, I expanded a bit on the virtues of using elasticsearch as a document store, as opposed to using a separate database. To my surprise, I still get hits on that article on a daily basis. This indicates that there is some interest in using elasticsearch as described there. So, I’m planning to start blogging a bit more again after more or less being too busy with building Inbot to do so since last February.


Nokia Android Phone

It appears that hell is freezing over and there are now strong rumors that on the evening of the completion of the deal with Microsoft, Nokia is going to push out an Android phone.

I’ve been more than a bit puzzled about this apparent move for a few weeks, but I think I’ve figured out a possible universe where this actually makes sense. Disclaimer: I’ve been outside of Nokia for quite some time now and don’t have any information that I shouldn’t be sharing. I’m just speculating.

A few days ago Ars Technica published an article recommending that Nokia should in fact not be forking Android, which is what it appears to be doing. One of the big arguments against this was that it isn’t working that well for Amazon either. Amazon has not licensed Google Play Services, which is basically what you need to license to get access to the Play Store, Chrome, Google Maps, and all the rest of the Google circus. So while Amazon’s Android based Kindles are perfectly nice tablets to use, most Android apps are not available for them because of compatibility issues and because most app developers don’t look beyond the Google store. Blackberry has exactly the same problem (insofar as they still have any ambitions in this respect).

Companies like HTC and Samsung have signed licensing deals with Google, and this means they have to ship whatever Google tells them to ship; in fact, the software updates for anything related to Play Services completely bypass whatever firmware these companies ship and instead update over the air constantly. This is Google’s fix for the problem that these companies are normally hopelessly behind with updates. I recently played with a Samsung and most of their added value on the software side is dubious at best. Most of it is outright crap and most tech savvy users prefer stock Android. I know I like my Nexus 5 a lot better, at least. Samsung is a hardware manufacturer without a solid software play. Amazon doesn’t want to be in that position; for them the software and hardware business is just a means towards an end: selling Amazon content. They compete with Google on this front and for this reason a deal between the two is unlikely.

So, I was thinking: exactly. It doesn’t make sense for Amazon to be doing this alone. Amazon needs a partner. What if that partner was Nokia + Microsoft? That would change the game substantially.

Amazon has already done a lot of work trying to provide an implementation of Google’s proprietary APIs. Amazon is already a licensee of Nokia maps, and together they could knock up an ecosystem that is big enough to convince application developers that it’s worth porting over to their app store. Microsoft and Nokia need to compete with Android not based on the notion that their platform is better (because arguably it is not) but primarily based on the notion that its app store is filled with third party goodies. It’s the one thing that comes up in every review of a Windows Phone, Blackberry (throwing them in for good measure), or Amazon device. Amazon + Nokia + Microsoft could fix this together. If you fix it for (very) low end phones, you can shove tens of millions of devices into the market in a very short time. That creates a whole new reality.

It seems that is exactly what Nokia is doing (if the rumors and screenshots are right): a low end Android phone with a Windows Phone like shell and without any of the Google services. One step up from this would be open sourcing the compatibility layer Amazon has built for Google’s proprietary Play Services, but plugged into competing services from Nokia, Microsoft, and Amazon. That would also be portable to other platforms, for example Windows Phone, which has had some app store related challenges of its own. Microsoft actually has a lot of code that already makes a lot of sense on Android. For example, Mono runs C# and other .Net stuff just fine on Android. With a bit of work, a lot of code could be ported over quite easily. Also, Microsoft and Nokia currently have a lot of Android manufacturers as paying customers; all those manufacturers currently get in return is a license for the patents they are infringing on. And don’t forget that a lot of Android manufacturers are not necessarily happy with the power grab Google has been executing with Android. Play Services is a classic bait and switch strategy: Google lured licensees in with open source that is now slowly being replaced with Google proprietary code. That’s why Samsung is making a big push with Tizen in the low end market this year, and it is also why people are eyeing Ubuntu, Firefox OS, and Sailfish as alternatives to Google.

In short, I’d be very surprised if Nokia was doing this by itself just before it sells the whole phone division. It doesn’t make sense. So, Microsoft has to be in it. And the only way that makes sense is if they want to take this all the way.

Will it work? I don’t know. I’ve seen both Microsoft and Nokia shoot themselves in their collective feet more than enough over the past few years. Both companies have done some amazingly stupid things. There is plenty of room for them to mess this up and they don’t have history on their side at this point. But it could work if they get their act together.