How to listen for key presses with AngularJS

I needed to build a table that can be navigated with the arrow keys in an AngularJS app. The ng-keypress documentation on the AngularJS site is pretty minimal. For example, it does not mention that the function you call can be passed an $event object that contains the key press. I created a small JSFiddle demo to show you how to catch virtually all key presses on your Angular site: arrow keys, return, function keys, all letters and numbers, etc.

One word of warning: this works for AngularJS 1.1.5 and up. Check out the JSFiddle: Catching key presses with AngularJS

 

Understanding HBase and BigTable

Just wanted to share this. For those of you who can’t quite get a grasp on what HBase is, this blog post explains it really well.

ElasticSearch backups

So you discovered ElasticSearch, great! A question that should come to mind once you start using it seriously is: can I create backups of my indexes? Luckily you can, although none of the available options is very user friendly.

First, you can make filesystem-level backups of the indexes, but with a big cluster this means you have to copy the data from every node.

You can also use a shared gateway and back up the data from the gateway. I would not advise a shared gateway, because the whole point of ElasticSearch is not having a single point of failure. In fact, the shared gateway has since been deprecated by ElasticSearch, so don’t even think about exploring that option!

The third option is to use the scan and scroll API calls that ElasticSearch offers. Together they let you scan all (or a subset) of your data and walk over the result set by repeatedly calling scroll. I have tested this on a fair amount of data (200 GB) and it works surprisingly well. That is why I decided to add a dump and an import script to my open source project ESClient (Python), to save you the trouble of reinventing the wheel 😉
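To make the scan and scroll idea concrete, here is a minimal sketch of the loop. It is not the code ESClient uses. It assumes the REST endpoints of the ElasticSearch releases from before 1.0 (search_type=scan on the first call, then /_search/scroll with the scroll id as the request body), and `request` is a hypothetical callable that sends the HTTP request and returns the decoded JSON.

```python
import json


def scan_and_scroll(request, index, query=None, scroll="10m", size=50):
    """Yield every hit of `index` by scanning once and then scrolling.

    `request(method, path, body)` is a hypothetical transport callable
    that performs the HTTP call and returns the decoded JSON response.
    """
    body = json.dumps(query or {"query": {"match_all": {}}})
    # Open the scan: this call returns no documents, only a scroll id.
    path = "/%s/_search?search_type=scan&scroll=%s&size=%d" % (index, scroll, size)
    response = request("POST", path, body)
    scroll_id = response["_scroll_id"]

    while True:
        # Each scroll call returns the next batch and a scroll id to continue with.
        response = request("GET", "/_search/scroll?scroll=%s" % scroll, scroll_id)
        hits = response["hits"]["hits"]
        if not hits:
            break  # an empty batch means the result set is exhausted
        for hit in hits:
            yield hit
        scroll_id = response["_scroll_id"]
```

Because the function is a generator, the caller can stream the hits straight into a file without holding the whole index in memory.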

If you install ESClient (with pip install esclient or easy_install esclient) you get these two scripts installed automatically. You can use them by simply entering esdump or esimport on the command line, and they will show you usage information.

As an example, suppose you have an index called ‘items’ and another called ‘customers’. You can back up both indexes to a bz2 file using:

esdump --url http://localhost:9200/ --indexes items customers --bzip2 --file items_customers.bz2

You can import this data using:

esimport --url http://localhost:9200 --file items_customers.bz2

Alternatively, you can import the data back to another index, e.g. items_test, by using the --index option on esimport.
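For the curious, the dump and import round trip can be sketched in a few lines of plain Python. The file layout below (one JSON document per line inside a bzip2 file) is my assumption for illustration and not necessarily the format esdump writes; `dump_hits` and `load_hits` are hypothetical names.

```python
import bz2
import json


def dump_hits(hits, filename):
    """Write one JSON document per line into a bzip2-compressed file."""
    with bz2.open(filename, "wt", encoding="utf-8") as handle:
        for hit in hits:
            handle.write(json.dumps(hit) + "\n")


def load_hits(filename, index=None):
    """Yield the stored hits, optionally redirected to another index."""
    with bz2.open(filename, "rt", encoding="utf-8") as handle:
        for line in handle:
            hit = json.loads(line)
            if index is not None:
                hit["_index"] = index  # same idea as importing into another index
            yield hit
```

Writing line by line keeps memory use flat, which matters once a dump grows to hundreds of gigabytes.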

Some notes

These two scripts currently support indexes that use the _parent and _routing fields. If you supplied a specific routing at index time, it will be restored too. The same holds if you specified a parent/child relation.

Not supported are indexes in which you don’t store the _source field. You cannot back up an index without this field.
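The two notes above can be illustrated with a small sketch that turns one dumped hit into the two lines the bulk API expects. This is not taken from esimport. It assumes that _parent and _routing were requested at dump time and therefore appear under the hit's "fields" key, and `bulk_lines` is a hypothetical helper name.

```python
import json


def bulk_lines(hit, index=None):
    """Return the two bulk-API lines (action, document) for one hit.

    Assumption: _parent and _routing, when present, live under
    hit["fields"] because they were requested during the dump.
    """
    if "_source" not in hit:
        # Without a stored _source there is nothing to re-index.
        raise ValueError("cannot restore %r: no _source stored" % hit.get("_id"))
    meta = {"_index": index or hit["_index"], "_type": hit["_type"], "_id": hit["_id"]}
    for key in ("_parent", "_routing"):
        value = hit.get("fields", {}).get(key)
        if value is not None:
            meta[key] = value  # keep the parent/child relation and custom routing
    return json.dumps({"index": meta}), json.dumps(hit["_source"])
```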

Future plans

It is relatively simple to also back up the mapping of the data, so this is high on my priority list. I also want to check the cluster state before dumping the data, to make sure you are not backing up a cluster that is in a bad state (Yellow or Red).
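The cluster state check could look roughly like the sketch below. It only inspects the "status" value that the cluster health endpoint (GET /_cluster/health) reports; `safe_to_dump` is a hypothetical name, and the planned esdump check may well end up looking different.

```python
def safe_to_dump(health, allowed=("green",)):
    """Return True when the cluster status is one we accept for a dump.

    `health` is the decoded JSON response of GET /_cluster/health.
    """
    # A missing status is treated as unsafe, just like yellow or red.
    return health.get("status") in allowed
```

Passing allowed=("green", "yellow") would let a single-node development cluster, which is typically yellow, be dumped as well.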

P.S.: from what I understand, ElasticSearch.org is working hard towards a 1.0 version which will offer backup and restore functionality out of the box!

ESClient: Python ElasticSearch Client

I’ve been working on a little project: a Python client for ElasticSearch. Are there other clients out there? Sure, and some are pretty decent, pyes in particular. But I miss documentation for that one and I don’t like the approach it takes to implementing the API. I wanted to create a simple client that stays close to the ElasticSearch REST API. It lets you directly submit either JSON or a tree of Python objects that can be converted to JSON. ESClient has documented code and comes with unit tests that should get you a long way. I plan to write proper documentation to get people started quickly once the API reaches a stable state.
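To show what "either JSON or a tree of Python objects" means in practice, here is a minimal sketch of how a client could normalise both kinds of input into a request body. It is an illustration only, not ESClient's actual code, and `to_body` is a hypothetical name.

```python
import json


def to_body(query):
    """Return a JSON string for `query`, given as JSON text or Python objects."""
    if isinstance(query, str):
        json.loads(query)  # fail early if the caller passed malformed JSON
        return query
    # Dicts, lists, numbers and so on are serialised as-is.
    return json.dumps(query)
```

With this in place, to_body('{"query": {"match_all": {}}}') and to_body({"query": {"match_all": {}}}) describe the same search.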

In a few days I have learned a lot about:

It’s simply awesome how quickly and easily you can get something up and running these days. With just a few lines of code and Python’s distutils you can create a source package and a Windows binary, upload it all to PyPI and make it installable for everyone in the world with a single command:

pip install esclient
or:
easy_install esclient

ESClient is still in its early stages and I can assure you the API will change over the coming weeks. I need to implement more API methods and I want to implement bulk indexing. Once all API methods are implemented and bulk indexing works, I will work towards a stable 1.0.0 release that will also have a stable API. I also want ESClient to handle errors well, which is something I missed in some of the other libraries I found, like pyelasticsearch.