Indexing data with Solr 1.4 from Microsoft SQL Server
With Solr you can index text documents, as well as database records, where each row is treated as a document. To connect to Microsoft SQL Server, I downloaded the Java (type 4) JDBC 3.0 driver for Microsoft SQL Server (6.5, 7, 2000, 2005 and 2008) from:
http://sourceforge.net/projects/jtds/files/jtds/1.2.5/jtds-1.2.5-dist.zip/download
Unzip the contents of the download. Locate the “jtds-1.2.5.jar” file, and copy it to your Tomcat installation directory:
\Tomcat 6.0\lib\
A note on “.jar” files. A jar file is nothing more than a zip file containing a lot of hard work on some software developers’ part. If you were to rename the .jar file to .zip, and unpack the contents, you will find a folder with a “package” structure. When you finally reference a “Class” withing the package you will specify the path to the class via this folder structure. In this case: net\sourceforge\jtds\jdbc ~ net.sourceforge.jtds.jdbc.Driver
So, if you are unsure as to the correct class path to specify, unzip the jar file and locate the Class you are interested in. This will translate to the “Class Path” as mentioned above.
If you downloaded “apache-solr-1.4.0″ as in the previous post, you can now copy a new template for your “Solr Home” from: apache-solr-1.4.0\example\example-DIH\
Copy the entire “solr” folder from the above directory to C:\
If you already have a “solr” directory in this location, rename it to something else not to loose any data. This is just a back-up of your previous work and can be renamed at anytime to reinstate it as your solr home.
Stop Tomcat for the moment, and open the C:\solr directory. Delete the mail- and rss folders including the solr.xml file. Open die db folder and copy both the conf- and lib folder into the solr directory. Delete the db folder.
Your “solr home” should now look something like: C:\solr containing both conf- and lib folders. Start Tomcat.
Visit http://127.0.0.1:8080/solr/ to make sure everything is still running.
(1) Configuring your Driver and Datasource
In this example I am using the “AdventureWorks” database supplied for Microsoft SQL Server 2008. Visit http://www.codeplex.com/MSFTDBProdSamples to download it.
Open the “db-data-config.xml” file in solr\conf.
Edit the “dataSource” node to look like:
<dataSource
type=”JdbcDataSource”
name=”ds-1″
driver=”net.sourceforge.jtds.jdbc.Driver”
url=”jdbc:sqlserver://127.0.0.1:1433;databaseName=AdventureWorks”
user=”sa”
password=”password” />
Substitute the database username and password with your database cridentials.
Edit the “entity” node, found inside the “document” node, to look like:
<entity
dataSource=”ds-1″
name=”People”
query=”SELECT ContactID,
NameStyle,
Title,
FirstName,
MiddleName,
LastName,
EmailAddress,
Phone,
ModifiedDate
FROM Person.Contact”>
<field column=”ContactID” name=”ContactID” />
<field column=”NameStyle” name=”NameStyle” />
<field column=”Title” name=”Title” />
<field column=”FirstName” name=”FirstName” />
<field column=”MiddleName” name=”MiddleName” />
<field column=”LastName” name=”LastName” />
<field column=”EmailAddress” name=”EmailAddress” />
<field column=”Phone” name=”Phone” />
<field column=”ModifiedDate” name=”ModifiedDate” />
</entity>
Save the “db-data-config.xml” file. You will notice that the “driver” attribute in the “dataSource” section looks exactly like the Class Path explained earlier.
(2) Update the Schema.xml for the Solr index
Now open the “schema.xml” file in the same directory as the “db-data-config.xml” file.
Scroll down about 2/3 of the file and find the “<fields>” section and replace all the “<field>” nodes with:
<field name=”ContactID” type=”integer” indexed=”true” stored=”true” required=”true” />
<field name=”NameStyle” type=”string” indexed=”true” stored=”true” />
<field name=”Title” type=”string” indexed=”true” stored=”true” />
<field name=”FirstName” type=”string” indexed=”true” stored=”true” />
<field name=”MiddleName” type=”string” indexed=”true” stored=”true” />
<field name=”LastName” type=”string” indexed=”true” stored=”true” />
<field name=”EmailAddress” type=”string” indexed=”true” stored=”true” />
<field name=”Phone” type=”string” indexed=”true” stored=”true” />
<field name=”ModifiedDate” type=”date” indexed=”true” stored=”true” />
(2.1) Find the “<uniqueKey>” node and change it to: <uniqueKey>ContactID</uniqueKey>;
(2.2) Find the “<defaultSearchField>” node and change it to: <defaultSearchField>LastName</defaultSearchField>;
(2.3) Delete all the “<copyField>” nodes.
Stop and Start Tomcat for Solr to parse the schema.xml and check it against your db-data-config.xml. Any error will be logged to Tomcats’ log file, or you will get an “exception error” when trying to access http://127.0.0.1:8080/solr/.
Now go to http://127.0.0.1:8080/solr/admin/dataimport.jsp and click the /DATAIMPORT link. You are now in the “DataImportHandler Development Console”. At the bottom-left hand side of the window you will notice a button named: “Full Import With Cleaning”. Click it to start indexing. Periodically click the “Status” button to check on the indexing progress.
While solr is still busy indexing, it will display “<str name=”status”>busy</str>”. Once it has successfully completed indexing all the data, the staus will be “idle” with the following message:
- Indexing completed. Added/Updated: 19972 documents. Deleted 0 documents.
Now go to: http://127.0.0.1:8080/solr/admin/ to see if all your hard work paid off. In the “Make a Query” section type *:* and click Search, pressing enter just adds a return.
Also vist http://wiki.apache.org/solr/SolrQuerySyntax for help on Solr Query Syntax! now I’m really late for work. Cheers.
1,362 Comments to “Indexing data with Solr 1.4 from Microsoft SQL Server”
Leave a Reply

Just desire to say your article is as astounding. The clarity in your post is simply spectacular and i can assume you are an expert on this subject. Well with your permission allow me to grab your feed to keep updated with forthcoming post. Thanks a million and please continue the rewarding work.
bread serving tongs…
[...]the time to read or visit the content or sites we have linked to below the[...]…
Great article! shared it with my friends on gmail! You should get a “Google+ 1″ share button too!
I cant get over how little you actually bring to light here. I think that everyones said the same thing that youve said over and over again. Dont you think its time for something more?
Very interesting, shared it with my friends on gmail! You should get a “Google+ 1″ share button too!
Comfortabl y, the post is actually the sweetest on this valuable topic. I harmonise with your conclusions and will eagerly look forward to your approaching updates. Saying thanks will not just be sufficient, for the great clarity in your writing. I will directly grab your rss feed to stay abreast of any updates. Genuine work and much success in your business efforts!
I know this if off topic but I’m looking into starting my own weblog and was curious what all is required to get set up? I’m assuming having a blog like yours would cost a pretty penny? I’m not very internet smart so I’m not 100% certain. Any recommendations or advice would be greatly appreciated. Many thanks
I’ve also been meditating on the exact same factor personally recently. Glad to see somebody on the same wavelength! Nice article.
It’s pretty interesting that the mainstream media has changed the way it looks at this recently dont you think? What used to neve be brought up or discussed has changed. Frankly it is about time we see a change.
I like Your Article about Indexing data with Solr 1.4 from Microsoft SQL Server | coldfury.us Perfect just what I was searching for! .
Really great article with very interesting information. You might want to follow up to this topic!?! 2012
It’s great site, I was looking for something like this