开发者

Writing an optimised and efficient search engine with mySQL and ColdFusion

开发者 https://www.devze.com 2022-12-22 23:07 出处:网络
I have a search page with the following scenarios listed below. I was told using so many conditional statements degrades performance, but I don\'t know if any other way to achieve the same objective.

I have a search page with the following scenarios listed below. I was told using so many conditional statements degrades performance, but I don't know if any other way to achieve the same objective.

  1. Search.cfm will processes a search made from a search bar present on all pages, with one search input (titleName).
  2. If search.cfm is accessed manually (through URL not through using the simple search bar on all pages) it displays an advanced search form with three inputs (titleName, genreID, platformID (dynamically generated, but not in this example code) or it evaluates searchResponse variable and decides what part of the page to show.
    1. If simple search/advanced search query is blank, has no results, or less than 3 characters it displays an error
    2. If any successful search returns results, they come back normally. There is one function handling the queries.

The top-of-page logic is as follows:

    <!---DEFINE DEFAULT STATE--->
<cfparam name="variables.searchResponse" default="">

<!---CHECK TO SEE IF SEARCH A FORM WAS SUBMITTED AND EXECUTE SEARCH IF IT WAS--->
<cfif IsDefined("Form.simpleSearch") AND Len(Trim(Form.titleName)) LTE 2>
    <cfset variables.searchResponse = "invalidString">
<cfelseif IsDefined("Form.simpleSearch") AND Len(Trim(Form.titleName)) GTE 3>
    <cfinvoke component="gz.cfcomp.search" method="searchTitles" titleName="#Form.titleName#" genreID="" platformID="" returnvariable="searchResult">
    <cfset variables.searchResponse = "hasResult">
</cfif>

<!---CHECK IF ADVANCED SEARCH FORM WAS SUBMITTED--->
<cfif IsDefined("Form.advancedSearch") AND (Len(Trim(Form.titleName)) LTE 2 AND Len(Form.genreID) IS 0 AND Len(Form.platformID) IS 0)>
    <cfset variables.searchResponse = "invalidString">
<cfelseif IsDefined("Form.advancedSearch") AND variables.searchResponse IS NOT "invalidString">
    <cfinvoke component="gz.cfcomp.search" method="searchTitles" returnvariable="searchResult" titleName="#Form.titleName#" genreID="#Form.genreID#" platformID="#Form.pl开发者_如何学运维atformID#">
    <cfset variables.searchResponse = "hasResult">
</cfif>

<!---CHECK IF ANY RECORDS WERE FOUND--->
<cfif IsDefined("variables.searchResult") AND searchResult.RecordCount IS 0>
    <cfset variables.searchResponse = "noResult">
</cfif>

I'm using the searchResponse variable to decide what the the page displays, based on the following scenarios:

<!---ALWAYS DISPLAY SIMPLE SEARCH BAR AS IT'S PART OF THE HEADER--->
<form name="simpleSearch" action="search.cfm" method="post">
<input type="hidden" name="simpleSearch" />
<input type="text" name="titleName" />
<input type="button" value="Search" onclick="form.submit()" />
</form>

<!---IF NO SEARCH WAS SUBMITTED DISPLAY DEFAULT FORM--->
<cfif searchResponse IS "">
    <h1>Advanced Search</h1>
    <!---DISPLAY FORM--->
    <form name="advancedSearch" action="search.cfm" method="post">
        <input type="hidden" name="advancedSearch" />
        <input type="text" name="titleName" />
        <input type="text" name="genreID" />
        <input type="text" name="platformID" />
        <input type="button" value="Search" onclick="form.submit()" />
    </form>
</cfif>

<!---IF SEARCH IS BLANK OR LESS THAN 3 CHARACTERS DISPLAY ERROR MESSAGE--->
<cfif searchResponse IS "invalidString">
    <cfoutput>
        <h1>INVALID SEARCH</h1>
    </cfoutput>
</cfif>

<!---IF SEARCH WAS MADE BUT NO RESULTS WERE FOUND--->
<cfif searchResponse IS "noResult">
    <cfoutput>
        <h1>NO RESULT FOUND</h1>
    </cfoutput>
</cfif>

<!---IF SEARCH MADE AND RESULT WAS FOUND--->
<cfif searchResponse IS "hasResult">
    <cfoutput>
        <h1>Search Results</h1>
    </cfoutput>
    <cfoutput query="earchResult">
        <!---DISPLAY QUERY DATA--->
    </cfoutput>
</cfif>

Is my logic a) not efficient because my if statements/is there a better way to do this? And b) Can you see any scenarios where my code can break? I've tested it but I have not been able to find any issues with it. And I have no way of measuring performance. Any thoughts and ideas would be greatly appreciated.

Here is my function, for reference:

<!---SEARCH--->
<cffunction name="searchTitles" hint="This functions searches for a title based on multiple categories" access="public" output="false">
    <cfargument name="titleName" required="no" type="string" hint="Search by title">
    <cfargument name="genreID" required="no" type="string" hint="Search by genre">
    <cfargument name="platformID" required="no" type="string" hint="Search by platform">
    <!--- DEFINE LOCAL VARIABLES - NOTE VARIABLE NAME IS QUERY NAME --->
    <cfset var searchResult = "">
    <!---GET RESULTS--->
    <cfquery name="searchResult" datasource="myDSN">
        SELECT
            games.gameID,
            games.gameReleaseDate AS rDate,
            titles.titleName AS tName,
            titles.titleShortDescription AS sDesc,
            platforms.platformName AS pName,
            genres.genreName AS gName
        FROM
            games
            INNER JOIN titles ON titles.titleID = games.titleID
            INNER JOIN platforms ON games.platformID = platforms.platformID
            INNER JOIN genres ON games.genreID = genres.genreID
        WHERE
            0=0
            <cfif ARGUMENTS.titleName IS NOT "">
                AND titles.titleName LIKE <cfqueryparam cfsqltype="cf_sql_varchar" value="%#ARGUMENTS.titleName#%">
            </cfif>
            <cfif ARGUMENTS.genreID IS NOT "">
                AND games.genreID = <cfqueryparam cfsqltype="cf_sql_varchar" value="#ARGUMENTS.genreID#">
            </cfif>
            <cfif ARGUMENTS.platformID IS NOT "">
                AND games.platformID = <cfqueryparam cfsqltype="cf_sql_varchar" value="#ARGUMENTS.platformID#">
            </cfif>
        ORDER BY
            rDate DESC,
            tName;
    </cfquery>
    <cfreturn searchResult>
</cffunction>

Many thanks


If you add default values of "" to your genreID and platformID arguments, then I think you can refactor your top code to:

<cfif StructKeyExists(url, "titleName") and Len(Trim(url.titleName)) lte 2>
    <cfset variables.searchResponse = "invalidString">
<cfelseif StructKeyExists(url, "titleName")>
    <cfinvoke component="gz.cfcomp.search" method="searchTitles" returnvariable="searchResult" argumentcollection="#url#">
    <cfset variables.searchResponse = "hasResult">
</cfif>

<cfif searchResponse eq "hasResult" and searchResult.recordCount eq 0>
    <cfset variables.searchResponse = "noResult">
</cfif>

Note: I recommend switching your form methods to "get" for a search like this to improve the user's experience navigating your site. I've switched all the form references to url in the code.


Have you thought about using a Full-Text search rather than just specific things like title, genre and platform? Full text is much faster that a normal table query at the cost of a bit more disk space usage, and allows very flexible Google style searches on all your data. Have a look at the link below.

http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html

Perhaps if you did this, you could just implement a pass though user input to the Full text search, since it provides all the advanced search expressions through it's own syntax. By doing this you probably would not need to branch your front end logic based on what they are searching for, this would make the solution cleaner and faster.


Just a preference I suppose, but since the advent of Google, people expect the one search box to do everything. You could provide some advanced search capabilities directly from the simple box by testing the input and conditionally running separate queries and returning the combined results. To make the results clearer you can highlight the matching portion with Coldfusion or javascript.

I would also init the component rather than using cfinvoke multiple times, but I'm not sure if that's a requirement or a preference.

<!--- By setting a default, we don't have to test for this var's existence repeatedly --->
<cfparam name="form.simpleSearch" default="" />

<cfscript>
// Using cfscript for compactness in this little box, could use tag based cf also

// Init myComponent so that it's ready to receive searches
searcher = createObject('component','myComponent');    

// init query results
simpleResult = queryNew("id");
genres = queryNew("id");
platforms = queryNew("id");
titles = queryNew("id");

// q: shorthand 'Query' var that's trimmed and prechecked for length
q = "";
if (len(trim(form.simpleSearch)) GTE 3) {
    q = trim(form.simpleSearch);
}

// Run simple search - Should returns a query object regardless of result count
if (len(q) {
    simpleResult = searcher.simpleSearch(q);

    /* Not sure what the difference between this and simpleSearch() is unless
    this only looks for exact matches, and simpleSearch() uses LIKE and %
    wildcards for looser results. */
    titles = searcher.titleSearch(q);

    /* Conditionally run advanced searches
       - assumes that genreID and platformID are numeric
       - IF genreID and platformID are not numeric, you could use regex to test for their signature or just run them anyway knowing they wouldn't match in many cases. (BUT you must, must MUST use cfqueryparam if you attempt that)
       - Requires that advancedSearch is split into three separate functions
       - Alternately, the simpleSearch(q) or advancedSearch() could do this all internally
       - All of these functions should return an empty query if no results are found
    if (isNumeric(q)) {
        genres = searcher.genreSearch(q);
        platforms = searcher.platformSearch(q);
    }
}

</cfscript>

...

<!---ALWAYS DISPLAY SIMPLE SEARCH BAR AS IT'S PART OF THE HEADER--->
<cfoutput>    
    <form action="#cgi.script_name#" method="post">
    <input type="text" name="simpleSearch" value="#form.q#" />
    <input type="submit" value="Search" name="submit" />
</form>

<cfif titles.recordcount>
    <h3>Matching Titles</h3>
    <cfloop query="titles">
        ..html to output the titles
    </cfloop>
</cfif>

<cfif genres.recordcount>
    <h3>Matching Genres</h3>
    <cfloop query="genres">
        ..html to output the genres
    </cfloop>
</cfif>

<cfif platforms.recordcount>
    <h3>Matching Platforms</h3>
    <cfloop query="platforms">
        ..html to output the platforms
    </cfloop>
</cfif>

<cfif simpleResult.recordcount>
    <h3>Results</h3>
    <cfloop query="simpleResult">
        ..html to output the simpleResult
    </cfloop>
</cfif>

</cfouput>

Granted, there are many, many different ways to do this. This is simply one method to reduce the complexity of a simple search vs an advanced search. The advanced results would only be shown if they had a match and you can tune those results with your query inside your myComponent function - just remember to return a blank query if there is no result.


My coworkers told me, that we should simplify the CFIF-Statements:

From 'CFIF something EQ ""' we moved to 'CFIF Not Len(something)' and we work more with booleans for NoResult and HasResult we would use a boolean.

Anyway, the other answers are some great inspiration, I need to check my own search page ;)

0

精彩评论

暂无评论...
验证码 换一张
取 消