Wednesday, 12 June 2013

Record Search with LINQ



A suprisingly simple search technique that will work for SQL tables, XML files, DataTables, etc.

Linq is just awesome.  I was recently asked a question on how to search through a data file with linq.  At first it seemed difficult, but after some though and fighting my way through the strange quirks, I realized how much easier Linq could [once again] make my life.

Here is how it's done (this article uses XML, but I have done the same exact thing with both SQL and DataTables, the concept is the same.)

The original question stirred from how to search through a bunch of recipes.  Each recipe (XML elements bolded) has a name, type (chicken, vegetable, etc.), calories that it contained, the amount of people that the recipe serves, and the instructions on how to make it.  Here is the dummy XML:

<?xml version="1.0" encoding="utf-8" ?>
<recipes>
    <recipe>
        <name>Hamburger</name>
        <serves>1</serves>
        <calories>500</calories>
        <type>Beef</type>
        <instructions>throw some crap together and cook it</instructions>
    </recipe>
    <recipe>
        <name>Chicken Nuggets</name>
        <serves>4</serves>
        <calories>400</calories>
        <type>Chicken</type>
        <instructions>throw some crap together and cook it</instructions>
    </recipe>
    <recipe>
        <name>Lasagna</name>
        <serves>8</serves>
        <calories>800</calories>
        <type>Pasta</type>
        <instructions>throw some crap together and cook it</instructions>
    </recipe>
    <recipe>
        <name>Ribeye</name>
        <serves>1</serves>
        <calories>600</calories>
        <type>Beef</type>
        <instructions>throw some crap together and cook it</instructions>
    </recipe>
    <recipe>
        <name>Drumsticks</name>
        <serves>3</serves>
        <calories>700</calories>
        <type>Chicken</type>
        <instructions>throw some crap together and cook it</instructions>
    </recipe>
    <recipe>
        <name>Beef Kabobs</name>
        <serves>4</serves>
        <calories>350</calories>
        <type>Beef</type>
        <instructions>throw some crap together and cook it</instructions>
    </recipe>
    <recipe>
        <name>Green Beans</name>
        <serves>5</serves>
        <calories>50</calories>
        <type>Vegetables</type>
        <instructions>throw some crap together and cook it</instructions>
    </recipe>
</recipes>

Now that we know our data structure, we can figure outhow to set up this search.  I am going to show a few different approaches so you can pick and choose which ones to use.  I am going to cover keyword search, number range search and specific text search.

First we have name search.  For that I am going to look for keywords.  I will allow users to enter none or as many as they want (to be more specific) into a TextBox.  These will be delimited with the standard space( ), comma(,) and semicolon(;).

Next there is type and serves which I am going to set to DropDownLists.  I am going to do this because not all numbers/words will be supported, I feel it is best to guide the user along with this and only supply available search possibilities, with 'any' always being the first option.

Finally there is the calories range.   For this, I will use two TextBoxes that will take in numbers of course; a minimum and a maximum.

Here is what the search interface looks like:

http://naspinski.net/image.axd?picture=search.jpg

Here is the markup:

<fieldset>
    <legend>Search Recipes</legend>
    <ul>
        <li>
            <label for="name">Name</label>
            <asp:TextBox runat="server" ID="txtName" />
        </li>
        <li>
            <label for="type">Type</label>
            <asp:DropDownList ID="ddlType" runat="server" />
        </li>
       
        <li>
            <label for="serves">Serves:</label>
            <asp:DropDownList ID="ddlServes" runat="server" />
        </li>

        <li>
            <label for="calories">Calories</label>
            <asp:TextBox ID="txtCalMin" runat="server" Columns="3" />
            <span style="float:left"> to </span>
            <asp:TextBox ID="txtCalMax" runat="server" Columns="3" />
        </li>

        <li>
            <asp:Button ID="btnSearch" runat="server" Text="Search"
                onclick="btnSearch_Click" />
        </li>
    </ul>
</fieldset>

Now with that out of the way, we can start with the code.  First a global XElement x; and IEnumerable<XElement> filteredResults; has to be declared, x will be used within the program and initialized on Page_Load, filteredResults will be explained later.  After that is initialized, on a fresh page load (!IsPostBack)the DropDownLists must be populated:

protected void Page_Load(object sender, EventArgs e)
{
    x = XElement.Load(Server.MapPath(".") + "\\App_Data\\recipes.xml");
    if(!IsPostBack)
    {
        // get all of your different food 'types' and put them in the ddl
        ddlType.Items.Add("any");
        foreach(string s in ((from p in x.Descendants("type") orderby p.Value select p.Value).Distinct()))
            ddlType.Items.Add(s);
        ddlServes.Items.Add("any");
        foreach (string s in (from p in x.Descendants("serves") orderby p.Value select p.Value).Distinct())
            ddlServes.Items.Add(s);
    }
}

Now that the form is all ready,it is time to delve in to the actualy search.  Now keep in mind, the Linq syntax will be a bit differen, but the methods are teh EXACT same when you are working with another data type (SQL, DataTable, etc.)  After a lot of deliberation on how to do this, I decided to split it up in to two seperate parts.  First filter the elements using all the filters EXCEPT the keywords (name), then apply the keyword search.  The reason I am doing it this way is because that search could include zero terms, or 5,000; therefore iteration makes the least amount of work.  Also, for ease of use, the search will workregardless of what a user enters.  By default, the search will return everything, then narrow it down as users select more criteria.  For this I will make it so the default min/max calories are 0/9999 respectively (I will ignore any input that isn't integers) and make sure to ignore the type and  serves if 'any' is selected.  Here is the code for search, I will explain it afterwards:

protected void btnSearch_Click(object sender, EventArgs e)
{
    try
    {
        string[] searchTerms = txtName.Text.Split(new string[] {" ", ",", ";"}, StringSplitOptions.RemoveEmptyEntries); //gets all your search terms
        int calMin = 0;
        int calMax = 9999;
        try { calMin = int.Parse(txtCalMin.Text); }catch { }
        try { calMax = int.Parse(txtCalMax.Text); }catch { }
        if (calMin > calMax) Response.Write("<h2 style=\"color:red;\">Error: Calories<div style=\"font-size:.5em;\">Minimum can not be larger than maximum</div></h2>");
      
        var searchResults = from p in x.Descendants("recipe") // filters everything by all of the fields except Name
                        where
                            (ddlType.SelectedIndex == 0 ? true : p.Element("type").Value.Equals(ddlType.SelectedValue.ToString())) &&
                            (ddlServes.SelectedIndex == 0 ? true : p.Element("serves").Value.Equals(ddlServes.SelectedValue)) &&
                            (int.Parse(p.Element("calories").Value) >= calMin) &&
                            (int.Parse(p.Element("calories").Value) <= calMax)
                    select p;

        foreach (string s in searchTerms) // since name can be multiple words, this iterates through them all, making sure that all of the terms are present
            searchResults = from p in searchResults where p.Element("name").Value.ToLower().Contains(s.ToLower()) select p;

        if (searchResults.Count() > 0)
        {
            // output your results
            foreach (XElement xe in searchResults)
                pnlOutput.Controls.Add(new LiteralControl("<a href=\"#" + xe.Element("name").Value + "\"><div><h3>" + xe.Element("name").Value + "</h3> Calories: " + xe.Element("calories").Value + "</div></a>"));
        }
        else
            pnlOutput.Controls.Add(new LiteralControl("No Entries match your search criteria"));
    }
    catch (Exception ex)
    {
        pnlOutput.Controls.Add(new LiteralControl("<h3>Error</h3>"+ex.Message));
    }
    pnlOutput.Visible = true;
}

First off, you can see that the searchTerms are made by splitting the input into an array, pretty simple.  Next the min/max calories are set to defaults and attempted to be changed to the inputs, and will only be changed if there are valid inputs; it will output an error if minCal > maxCal.

The first leg of the search is pretty simple just return all descendants in the XML file of type recipe that follow the searhc criteria.  First I check the type, if the DropDownList is at SelectedIndex of 0 (which is the 'any' selection) I will return all, otherwise, jsut the ones that equal the selected type; I do the exact same for serves.  Then a simple check that returns those that calMin >= calories <=calMax.

That was not the part that confused me, it was how to get a dynamic number of search terms to be iterated through.  But this is where I cam up with the simplest of solutions.  Just search EACH term alone, and interate through it with Linq and a foreach... the beauty of the IEnumerable.  For each search term in searchTerms I simply run a new Linq statement that checks if the ameElement.Contains(that_search_term).  Therefore, every time this runs through, it will drop all entries that don't contain the term, each iteration [likely] returning fewer entries -- so simple!  But not really, for some reason (still not completely sure why), if I simply run:

foreach (string s in searchTerms)
    searchResults = from p in searchResults where p.Element("name").Value.ToLower().Contains(s.ToLower()) select p;

Does NOT work, it really only honors the last term; it is not overwriting searchResults every iteration.  After some testing, I had found out that I had to make a new variable (that's the filteredResult that we declared above) that would instead take place of searchResults and then we can write over it.  But also, you may not do this inside the foreach loop itself, as that still only honors the last term.  BUT, if the method is taken out and placed seperately, the overwrite seems to work.  But NOT for searchResults, I have to use filteredResults.  If anyone understnad exactly why this extra step is necessary, pleae enlighten me!  Here is how it is called:

filteredResults = searchResults;
foreach (string s in searchTerms) // since name can be multiple words, this iterates through them all, making sure that all of the terms are present
    filteredResults = iterateThroughSearchTerm(s);

And here is the method:

protected IEnumerable<XElement> iterateThroughSearchTerm(string term)
{
    return (from p in filteredResults where p.Element("name").Value.ToLower().Contains(term.ToLower()) select p);
}

All that is left is to simply output your findings, an error message or a 'sorry, nothing found' message.  And there you go, a bunch of different search approaches all covered at once.  Here is some example code to get you started.

 

No comments:

Post a Comment

C# LINQ Joins With SQL

There are  Different Types of SQL Joins  which are used to query data from more than one database tables. In this article, you will learn a...