Automatic ID generation in Apache Solr
I have been working with Apache Solr for the last few months and have been receiving requirements to speed up query processing. As part of the investigation, I found that generating unique ids for the retrieved documents contributes to query processing, so I decided to write this post.
# Data Structure
Our sample data structure (the fields section of schema.xml) looks like this:
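A minimal sketch of what such a fields section might contain, assuming a Solr 4.x style schema.xml. The `name` field and the chosen field types are assumptions made for illustration; only `id` is named in this post.

```xml
<!-- Sketch only: field names other than "id" and all types are assumed. -->
<fields>
  <!-- Holds the generated identifier; a plain string is enough to store a UUID. -->
  <field name="id" type="string" indexed="true" stored="true" multiValued="false"/>
  <!-- Hypothetical content field used by the test document. -->
  <field name="name" type="text_general" indexed="true" stored="true"/>
  <!-- Internal version field that Solr maintains for each document. -->
  <field name="_version_" type="long" indexed="true" stored="true"/>
</fields>
```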
In addition, I specified which field holds the unique identifier. This is also done in the schema.xml file:
<uniqueKey>id</uniqueKey>
# Solr Configuration
In addition to the changes in the schema.xml file, I need to modify the solrconfig.xml file and introduce a proper UpdateRequestProcessorChain, as shown below:
<str name="fieldName">id</str>
This tells Solr that the contents of the id field are to be generated automatically.
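A fuller sketch of such a chain in solrconfig.xml could look like the following. It assumes Solr 4.x, where `solr.UUIDUpdateProcessorFactory` is available, and the chain name `uuid` is a hypothetical choice, not something fixed by Solr.

```xml
<!-- Sketch only: the chain name "uuid" is an assumed, arbitrary label. -->
<updateRequestProcessorChain name="uuid">
  <!-- Adds a random UUID to the "id" field when the document has no value for it. -->
  <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">id</str>
  </processor>
  <!-- Logs the update, then hands the document to the index. -->
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

Because the UUID processor sits before `solr.RunUpdateProcessorFactory`, the identifier is already present when the document reaches the index.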
# Simple Test
Enough configuration; time to test it. Run the command below from a terminal to index a document before querying the index.
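As a rough illustration, the update message such a command sends could be as small as the one below. The field name `name` is an assumption; the value `Test` matches the response shown later in this post. The message would typically be sent as an HTTP POST (for example with curl) to the core's `/update` handler with content type `application/xml`, passing `commit=true` and an `update.chain` parameter that names the configured chain (for instance a chain hypothetically called `uuid`).

```xml
<!-- Sketch only: note that no "id" field is supplied; Solr is expected to generate it. -->
<add>
  <doc>
    <field name="name">Test</field>
  </doc>
</add>
```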
If the command runs without errors, the document is indexed. It can then be queried with the command below:
The query returns the following document:
status: 0
QTime: 0
request parameters: true, *:*
document: Test, id 1cdee8b4-c42d-4101-8301-4dc350a4d522, _version_ 1439726523307261952
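For orientation, here is a sketch of how these values would sit in Solr's standard XML response format. It assumes a request along the lines of `/select?q=*:*&indent=true` and a content field called `name`; both the `indent` parameter name and the `name` field are assumptions, while the values are the ones listed above.

```xml
<!-- Sketch only: the "indent" parameter and the "name" field are assumed names. -->
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="indent">true</str>
      <str name="q">*:*</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="name">Test</str>
      <!-- Generated by Solr; it was not part of the indexed document. -->
      <str name="id">1cdee8b4-c42d-4101-8301-4dc350a4d522</str>
      <long name="_version_">1439726523307261952</long>
    </doc>
  </result>
</response>
```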
If you look at the response, you can see that the unique identifier was generated automatically. If you now run the same commands again (adding a document, then querying), the result looks like this:
status: 0
QTime: 1
request parameters: true, *:*
document 1: Test, id 1cdee8b4-c42d-4101-8301-4dc350a4d522, _version_ 1439726523307261952
document 2: Test, id 9bedcb5f-1b71-4ab7-80a9-9882a6bf319e, _version_ 1439726693819351040
As you can see, the two documents have different unique identifiers, both generated by Solr.