Indexing data with Solr 1.4 from Microsoft SQL Server

With Solr you can index text documents, as well as database records, where each row is treated as a document. To connect to Microsoft SQL Server, I downloaded the Java (type 4) JDBC 3.0 driver for Microsoft SQL Server (6.5, 7, 2000, 2005 and 2008) from:

http://sourceforge.net/projects/jtds/files/jtds/1.2.5/jtds-1.2.5-dist.zip/download

Unzip the contents of the download. Locate the “jtds-1.2.5.jar” file, and copy it to your Tomcat installation directory:

\Tomcat 6.0\lib\

A note on “.jar” files. A jar file is nothing more than a zip file containing a lot of hard work on some software developers’ part. If you were to rename the .jar file to .zip, and unpack the contents, you will find a folder with a “package” structure. When you finally reference a “Class” withing the package you will specify the path to the class via this folder structure. In this case:  net\sourceforge\jtds\jdbc ~ net.sourceforge.jtds.jdbc.Driver

So, if you are unsure as to the correct class path to specify, unzip the jar file and locate the Class you are interested in. This will translate to the “Class Path” as mentioned above.

If you downloaded “apache-solr-1.4.0″ as in the previous post, you can now copy a new template for your “Solr Home” from: apache-solr-1.4.0\example\example-DIH\

Copy the entire “solr” folder from the above directory to C:\

If you already have a “solr” directory in this location, rename it to something else not to loose any data. This is just a back-up of your previous work and can be renamed at anytime to reinstate it as your solr home.

Stop Tomcat for the moment, and open the C:\solr directory. Delete the mail- and rss folders including the solr.xml file. Open die db folder and copy both the conf- and lib folder into the solr directory. Delete the db folder.

Your “solr home” should now look something like: C:\solr containing both conf- and lib folders. Start Tomcat.

Visit http://127.0.0.1:8080/solr/ to make sure everything is still running.

(1) Configuring your Driver and Datasource

In this example I am using the “AdventureWorks” database supplied for Microsoft SQL Server 2008. Visit http://www.codeplex.com/MSFTDBProdSamples to download it.

Open the “db-data-config.xml” file in solr\conf.

Edit the “dataSource” node to look like:

<dataSource
type=”JdbcDataSource”
name=”ds-1″
driver=”net.sourceforge.jtds.jdbc.Driver”
url=”jdbc:sqlserver://127.0.0.1:1433;databaseName=AdventureWorks”
user=”sa”
password=”password” />

Substitute the database username and password with your database cridentials.

Edit the “entity” node, found inside the “document” node, to look like:

<entity
dataSource=”ds-1″
name=”People”
query=”SELECT    ContactID,
NameStyle,
Title,
FirstName,
MiddleName,
LastName,
EmailAddress,
Phone,
ModifiedDate
FROM    Person.Contact”>

<field column=”ContactID” name=”ContactID” />
<field column=”NameStyle” name=”NameStyle” />
<field column=”Title” name=”Title” />
<field column=”FirstName” name=”FirstName” />
<field column=”MiddleName” name=”MiddleName” />
<field column=”LastName” name=”LastName” />
<field column=”EmailAddress” name=”EmailAddress” />
<field column=”Phone” name=”Phone” />
<field column=”ModifiedDate” name=”ModifiedDate” />

</entity>

Save the “db-data-config.xml” file. You will notice that the “driver” attribute in the “dataSource” section looks exactly like the Class Path explained earlier.

(2) Update the Schema.xml for the Solr index
Now open the “schema.xml” file in the same directory as the “db-data-config.xml” file.

Scroll down about 2/3 of the file and find the “<fields>” section and replace all the “<field>” nodes with:

<field name=”ContactID” type=”integer” indexed=”true” stored=”true” required=”true” />
<field name=”NameStyle” type=”string” indexed=”true” stored=”true” />
<field name=”Title” type=”string” indexed=”true” stored=”true” />
<field name=”FirstName” type=”string” indexed=”true” stored=”true” />
<field name=”MiddleName” type=”string” indexed=”true” stored=”true” />
<field name=”LastName” type=”string” indexed=”true” stored=”true” />
<field name=”EmailAddress” type=”string” indexed=”true” stored=”true” />
<field name=”Phone” type=”string” indexed=”true” stored=”true” />
<field name=”ModifiedDate” type=”date” indexed=”true” stored=”true” />

(2.1) Find the “<uniqueKey>” node and change it to: <uniqueKey>ContactID</uniqueKey>;

(2.2) Find the “<defaultSearchField>” node and change it to: <defaultSearchField>LastName</defaultSearchField>;

(2.3) Delete all the “<copyField>” nodes.

Stop and Start Tomcat for Solr to parse the schema.xml and check it against your  db-data-config.xml. Any error will be logged to Tomcats’ log file, or you will get an “exception error” when trying to access http://127.0.0.1:8080/solr/.

Now go to http://127.0.0.1:8080/solr/admin/dataimport.jsp and click the /DATAIMPORT link. You are now in the “DataImportHandler Development Console”. At the bottom-left hand side of the window you will notice a button named:  “Full Import With Cleaning”. Click it to start indexing. Periodically click the “Status” button to check on the indexing progress.

While solr is still busy indexing, it will display “<str name=”status”>busy</str>”. Once it has successfully completed indexing all the data, the staus will be “idle” with the following message:

- Indexing completed. Added/Updated: 19972 documents. Deleted 0 documents.

Now go to: http://127.0.0.1:8080/solr/admin/ to see if all your hard work paid off. In the “Make a Query” section type *:* and click Search, pressing enter just adds a return.

Also vist http://wiki.apache.org/solr/SolrQuerySyntax for help on Solr Query Syntax! now I’m really late for work. Cheers.

1,362 Comments to “Indexing data with Solr 1.4 from Microsoft SQL Server”

  1. San Jose rug cleaning 24 August 2011 at 6:09 pm #

    I really like your wordpress web template, where did you obtain it from?

  2. Lily Lafuze 24 August 2011 at 6:15 pm #

    I would like to express some thanks to the writer for rescuing me from this issue. Right after checking through the internet and finding opinions that were not pleasant, I was thinking my entire life was over. Being alive minus the answers to the issues you have fixed by way of the post is a critical case, and those that might have badly damaged my career if I had not come across your blog. Your actual natural talent and kindness in handling a lot of things was invaluable. I’m not sure what I would have done if I had not discovered such a stuff like this. I can also now relish my future. Thank you very much for the reliable and results-oriented guide. I will not be reluctant to recommend the blog to any individual who needs and wants guidelines on this subject matter.

  3. discount ugg boots 24 August 2011 at 6:19 pm #

    This is a very helpful post, I was looking for this info. Just so you know I located your webpage when I was looking around for blogs like mine

  4. Article on The Topic…

    …When you are aware what is your job you can be a lot more successful than when you have no knowledge….[...]…

  5. Replica Watches 24 August 2011 at 7:01 pm #

    I do love the way you have presented this concern plus it does provide us a lot of fodder for consideration. Nonetheless, because of everything that I have personally seen, I basically hope as the commentary pile on that individuals stay on issue and don’t start upon a soap box associated with some other news du jour. AnywayReplica Audemars Piguet, thank you for this exceptional piece and even though I do not agree with it in totality, I respect the viewpoint.

  6. Hailey Tydeman 24 August 2011 at 7:13 pm #

    fantastic points altogether, you simply gained a new reader. What would you suggest in regards to your post that you made a few days ago? Any positive?

  7. accounts receivables solutions 24 August 2011 at 7:31 pm #

    AKTKPI Cheap Valium

  8. medical doctors diet 24 August 2011 at 7:33 pm #

    I think one of your adverts triggered my browser to resize, you might want to put that on your blacklist.

  9. Calorad 24 August 2011 at 7:33 pm #

    Thanks, I have recently been searching for details about this subject for ages and yours is the best I have found so far.

  10. Real Estate Lewes Delaware 24 August 2011 at 8:22 pm #

    Tracy thank you so much for sharing your journey. It has been a hard one and Im so thankful you are with us, and that your heart is being warmed, and that you are blessed with a good church. You add significantly to our group, sharing the beauty, and I thank God for you.

  11. cheap uggs online 24 August 2011 at 8:42 pm #

    Good post.Thx

  12. real estate Lewes de 24 August 2011 at 8:57 pm #

    Alright Aaron, I hear what youre saying. I still favor Bozhanov – he just strikes me as such an amazing musician – depth and range beyond compare! : )

  13. Real Estate Lewes Delaware 24 August 2011 at 9:13 pm #

    The only new argument added to the mix is gays cant have kids. And yet, as pointed out several times before, no-one suggests that women be disallowed from marriage after the menopause.

  14. Web Design St Louis 24 August 2011 at 9:19 pm #

    i like this blog too bad you have so many unrelated comments

  15. dance workout 24 August 2011 at 9:29 pm #

    Your site rocks!

  16. cash 4 that mobile 24 August 2011 at 9:35 pm #

    I love your wp template, exactly where did you get a hold of it from?

  17. real estate Lewes de 24 August 2011 at 9:49 pm #

    Anne I also recommend reading March 10 entry in Streams in the Desert. It pertains to sensing the presence of God. Amazing how God is using each of us to help one another on the journey.

  18. Real Estate Lewes Delaware 24 August 2011 at 9:50 pm #

    At long last! I have been seeking details on lego for a while and found it, my son should be pleased!

  19. Egipt last minute 24 August 2011 at 10:25 pm #

    Howdy, i read your blog often and i own an similar one and i used to be simply curious if you get a lot of spam comments? If thus how do you stop it, any plugin or anything you’ll be able to advise? I get thus often lately it’s driving me mad so any help is very abundant appreciated. for older folks, retirement or a huge promotion at work is a time when celebration events are going to be a great idea.

  20. real estate Lewes de 24 August 2011 at 10:28 pm #

    Ya I do see the cubs in the world series in 2007, if all the other teams are disqualified for wearing their jerseys inside out

  21. fishin 2 go apk 24 August 2011 at 10:34 pm #

    I agree with your Indexing data with Solr 1.4 from Microsoft SQL Server | coldfury.us, superb post.

  22. farmville cheat engine 24 August 2011 at 10:34 pm #

    I agree with your Indexing data with Solr 1.4 from Microsoft SQL Server | coldfury.us, fantastic post.

  23. humans vs aliens apk 24 August 2011 at 10:34 pm #

    I agree with your Indexing data with Solr 1.4 from Microsoft SQL Server | coldfury.us, superb post.

  24. Real Estate Lewes Delaware 24 August 2011 at 10:37 pm #

    A note regarding the use of non-serif fonts for on-screen, and serif fonts for printed text. I work in technical publications, including Web stuff, and a lot of my friends and acquaintences are Web designers, so this is not a new conversation for me.

  25. Real Estate Lewes Delaware 24 August 2011 at 11:07 pm #

    I am wondering whether some enhancements in the Liszt-Gounod Faust waltz were Evgenis or alternatives by Liszt (such as extra notes in the bass octaves and more exotic harmonic progressions etc). I know this piece very well and have played it in recital in the Peters Edition version. Liszt wrote several versions of many of his pieces and maybe this piece as well. I will have to ask Leslie Howard about this as he knows all the versions inside out. I found the variations or ossias very interesting. Its quite a dangerous piece with many opportunities to land on wrong notes. I think he played it very well.

  26. 30 second smile review 24 August 2011 at 11:10 pm #

    I must show some appreciation to this writer for rescuing me from this instance. After looking out through the the net and finding tips which are not helpful, I figured my life was over. Existing minus the answers to the difficulties you’ve resolved through the review is a critical case, as well as those which could have in a negative way affected my career if I hadn’t encountered your blog post. Your good natural talent and kindness in controlling a lot of things was valuable. I’m not sure what I would have done if I had not encountered such a point like this. I’m able to now relish my future. Thanks for your time so much for your professional and results-oriented help. I will not think twice to suggest your blog post to anyone who ought to have guidance about this subject.

  27. Real Estate Lewes Delaware 24 August 2011 at 11:24 pm #

    Yes it is, SugarJar. Open Source software is software whose code is freely distributed. There are many different open source licenses, but thats the nut of the matter. (The Open Source Initiatives definition can be found here: opensource.org/osd.html. They also have a list of licenses here: opensource.org/licenses/index.html.)

  28. Real Estate Lewes Delaware 24 August 2011 at 11:29 pm #

    I have not seen Bergman’s SCENES FROM A MARRIAGE but what you describe does seem very moving and sad.

  29. real estate Lewes de 24 August 2011 at 11:49 pm #

    What would a conservatory teacher say? What would Clara say?

  30. tinnitus remedies 25 August 2011 at 12:26 am #

    Hi there! Do you use Twitter? I’d like to follow you if that would be ok. I’m definitely enjoying your blog and look forward to new posts.

  31. brand management 25 August 2011 at 12:32 am #

    I loved as much as you’ll receive carried out right here. The sketch is attractive, your authored material stylish. nonetheless, you command get got an shakiness over that you wish be delivering the following. unwell unquestionably come more formerly again as exactly the same nearly very often inside case you shield this increase.

  32. Zahnarzt Kiel 25 August 2011 at 12:48 am #

    Hey! Quick question that’s entirely off topic. Do you know how to make your site mobile friendly? My site looks weird when browsing from my apple iphone. I’m trying to find a theme or plugin that might be able to fix this problem. If you have any recommendations, please share. Cheers!

  33. real psychic 25 August 2011 at 12:55 am #

    Study from new books but from old teachers

  34. pranie dywanów Warszawa 25 August 2011 at 1:07 am #

    regards from NY

  35. real estate Lewes de 25 August 2011 at 1:09 am #

    James My second favorite team too, although I wish they were still in Seattle. I believe, like you said, that if the core stays intact, they could win a few. As long as they dont have to beat the Celtics in the Finals, I Want to see Kevin Durant get a ring in the next two years.

  36. Superb website…

    [...]always a big fan of linking to bloggers that I love but don’t get a lot of link love from[...]……

  37. penis pills 25 August 2011 at 1:20 am #

    It is critical to understand that you can just assist what exactly characteristics has got granted anyone, if you decide to have a 5 inch member, you’ll never obtain a Ten inch member. Likewise many other factors for example adhering to a regular program, circulation of blood to the male member for example. Most males statement all-around a 2 ” obtain. Having said that, numerous men get reported results up to A few inches width.

  38. Brad Fallon 25 August 2011 at 1:28 am #

    Congratulations for posting such a useful blog. Your blog isn’t only informative but also extremely artistic too. There usually are extremely couple of individuals who can write not so easy articles that creatively. Keep up the good writing !!

  39. tarot visa 25 August 2011 at 1:40 am #

    Apple now has Rhapsody as an app, which is a great start, but it is currently hampered by the inability to store locally on your iPod, and has a dismal 64kbps bit rate. If this changes, then it will somewhat negate this advantage for the Zune, but the 10 songs per month will still be a big plus in Zune Pass’ favor.

  40. nod32 free download 25 August 2011 at 1:46 am #

    I love your site

  41. Lump Sum Profits Review 25 August 2011 at 1:56 am #

    Nice blog here! Also your site loads up fast! What web host are you using? Can I get your affiliate link to your host? I wish my website loaded up as quickly as yours lol

  42. us cities database 25 August 2011 at 2:05 am #

    More often than not I don’t comment on a person’s blog, but I would just like to mention that this post really has forced me to do so! Thank you for your insightful post.

  43. real estate Lewes de 25 August 2011 at 2:17 am #

    were here, were brownon you bums we will go to town

  44. real estate Lewes de 25 August 2011 at 3:08 am #

    there are a few songswhich one?

  45. Francisco Schnettler 25 August 2011 at 3:10 am #

    Excellent weblog. Plenty of helpful information here. I was looking for this.Thanks for sharing!

  46. shed building 25 August 2011 at 3:59 am #

    Hello admin, thanks for posting this useful article. It really helps me a lot

  47. fishin 2 go full 25 August 2011 at 4:16 am #

    Thanks i love your article about Indexing data with Solr 1.4 from Microsoft SQL Server | coldfury.us

  48. watch bears vs saints 25 August 2011 at 4:47 am #

    This actually answered my problem, thank you!

  49. Lovetta Foisy 25 August 2011 at 5:03 am #

    Hmm is anyone else having problems with the pictures on this blog loading? I’m trying to find out if its a problem on my end or if it’s the blog. Any responses would be greatly appreciated.

  50. tarot 25 August 2011 at 5:50 am #

    Zune and iPod: Most people compare the Zune to the Touch, but after seeing how slim and surprisingly small and light it is, I consider it to be a rather unique hybrid that combines qualities of both the Touch and the Nano. It’s very colorful and lovely OLED screen is slightly smaller than the touch screen, but the player itself feels quite a bit smaller and lighter. It weighs about 2/3 as much, and is noticeably smaller in width and height, while being just a hair thicker.


Leave a Reply