Welcome Guest, Not a member yet? Register   Sign In
MYSQL Searching
#1

[eluser]shankar ganesh[/eluser]
I would like to know how to build query in MySql for search option.

For eg, the skills table contains records like

id skills
1 PHP, MySql, Java
2 MySql, ASP.NET
3 Java, Javascript
4 Sharepoint, Jquery, PHP
5 ASP, mssql
6 javascript


when i enter search key like "PHP,Java" in textbox, the result should fetch the matching id's like,

1,3,4,6 (it should fetch javascript also)

if search key is "javascript" the resulting ids should be

1,4 (it should fetch java also)

if search key is "MySql" the resulting ids should be

1,2,5 (it should fetch mssql also)
#2

[eluser]Jondolar[/eluser]
Put wildcard characters around your search word:
WHERE skills LIKE %PHP%

If you have multiple words:
WHERE skills LIKE %PHP% AND skills LIKE %JAVA%

Alternatively, look into FULLTEXT indexing.
Alternatively, (and possibly most correctly), normalize your database. Create a skills_users table that has the userid and the skill, one per skill
id skill
1 php
1 mysql
1 java
2 mysql
2 asp

Then, juse a join when searching. You can use the DISTINCT keyword to return distinct records.

You can even normalize the above more by linking to a skills table that just lists skills, then your skills_users table would have an index to the user and an index to the skill (one record for each skill).

Good luck.
#3

[eluser]shankar ganesh[/eluser]
its fine,
but when more number of users exists a lot of duplicate records may exists.
How to resolve this?
#4

[eluser]sdmagic[/eluser]
I agree with Jondolar, Shankar. You need to normalize your tables.

I would set it up with the following 3 tables:
Code:
users
---------------------
userid    INT PK
lastname  VARCHAR(45)
firstname VARCHAR(45)

userlanglink
---------------------
userlanglinkid INT PK
userid         INT
langid         INT

lang
--------------------
langid   INT PK
language VARCHAR(45)

One user per row in users, one language per row in lang and multiple links in userlanglink:

Code:
users
----------
1 Doe John
2 Doe Jane

userlanglink
------------
1 1 1
2 1 3
3 2 2
4 2 3

lang
----------
1 Assembly
2 Basic
3 C++

Then to find out which languages a user knows, the SQL is:

Code:
SELECT users.firstname, users.lastname, lang.language
       FROM users LEFT OUTER JOIN userlanglink ON users.userid = userlanglink.userid
                  LEFT OUTER JOIN lang ON userlanglink.langid = lang.langid
       WHERE users.firstname = 'first name searching for' AND
             users.lastname  = 'last name searching for';

To find out which users know a particular language:

Code:
SELECT users.firstname, users.lastname, lang.language
       FROM users LEFT OUTER JOIN userlanglink ON users.userid = userlanglink.userid
                  LEFT OUTER JOIN lang ON userlanglink.langid = lang.langid
       WHERE lang.language = 'whatever language searching for';

Of course you can use AND and OR in the WHERE clause in the above SQL to include other languages.

Using foreign keys and careful programming a normalized system is the best way to go. Do not worry about duplication
in a properly normalized system. There is no duplication in the 3 tables I've listed above.
#5

[eluser]sdmagic[/eluser]
Shankar,

As far as selecting multiple search keys, I would not let the user just type them into a text box. The parsing and validation horrors that creates are too terrible to contemplate. I also would not assume that if they search for Java, they always also want Javascript. I would code it using a Multi Select input:

Code:
<select name=mytextarea size=10 multiple>
    <option name=Asm value=asm>Assembly</option>
    <option name=Bas value=bas>Basic</option>
    <option name=Cpp value=cpp>C++</option>
</select>

It would be best to populate the select box using the data from the lang table. Then, your form is always up to date and in sync with your table.

With this scheme, the user has control over search keys and you have an easy to parse and validate input. You just build your WHERE clause based on the keyword(s) that were given.

Steve
#6

[eluser]shankar ganesh[/eluser]
Thanks fine, Actually this is for job portal.

Where multiple users enter multiple key skills in textbox what they known. In this case
using dropdown is not feasible bcos for this we should predict all the languages at first. But in case if the jobseeker is an non IT field then we should provide only textbox.

According to using another separate language table when user enters key skills we should validate whether duplicate is exists, its ok no problem. But when users(jobseekers) grows, a huge amount of skills will be added in language table. When submitting form it takes more time for searching duplication in language table and fetching id from language table. How to solve this problem?
#7

[eluser]nuwanda[/eluser]
Why not just have your users tick checkboxes for each language?

Then it's easier to add the values together in a search query. You don't have to worry about errors.

Further, if you add new languages you can just add extra checkboxes.

Control the user input as much as possible.
#8

[eluser]shankar ganesh[/eluser]
Actually if a jobseeker is an

1. IT person enter their skills like ASP.NET,JAVA,PHP,MSSQL,MYSQL,ORACLE,FLASH
2. Electronics Person enter their skills like, DSP,PROTEL,COMMUNICATION,VLSI,EMBEDDED

we can't able to predict other sector person's skills its very difficult to get skills regarding all sectors in the world.

so we have to give option for entering their skills.
#9

[eluser]sdmagic[/eluser]
In my previous posts I was talking about searching for exisiting records. I also thought you were talking specifically about computer languages. For assorted skills, just change "lang" to "skills" and "language" to "skill" in my above posts.

For new entries, you need to validate each field for blanks and stupid input. Validating for a blank field is easy, validating for stupid input is hard. We all know how to validate a URL, an email address, a phone number, zip and postal codes, etc. Validating other kinds of free form text is a different kettle of fish. You can use Soundex routines to make an attempt at catching spelling errors (although this is hit and miss). You can build lookup tables that contain text you don't allow. For example, profanity probably shouldn't be allowed on most business sites. You should have lookup tables for cities, states and countries, etc. You can include zip and postal code tables that validate city, state and country data. Some companies hold new records in temporary tables so that a human can evaluate and accept or reject the data. It's up to you and your company or client. My experience has taught me that you can never validate too much. Dirty data accumulates over the years and causes a lot of maintenance problems.

Once all the validation is completed, I always search before inserting new rows.

So, to update the database with user name and skill data I would (You would structure things differently if you only allow new users and not updates to an existing user):

Code:
Display some eye-candy so the user doesn't think we've died (search for a flight or hotel on a travel site to see what I mean here).
Parse the skills input to separate each skill
Validate the input (building error messages as we progress)
If all fields and skills are valid
    Search for the user name
    If the user name doesn't exist
        Insert the new user
    Get the user ID number
    For each skill
        Search for the skill
        If the skill doesn't exist
            Insert the new skill into the skills table
        Get this skill's ID
        Insert the user ID and this skill's ID into the link table
    Display a success message
Else
    Dislay all error messages we've accumulated

I demonstrated how to avoid duplicate data in this and my previous posts. You mentioned that you were worried about speed. I wouldn't fret too much over it. Try it first and see how the system works. Carefully plan your indexes and your caching. I'm pretty sure your skillset table will be quite small -- Probably less than 1,000 rows which is very tiny (I code a lot of database systems where the tables have hundreds of thousands of rows). There just aren't that many skills in the world even when you include multiple human languages. If your user base grows to hundreds of thousands or millions, you will be able to afford to scale up your hardware and move to clusters and such.

Steve




Theme © iAndrew 2016 - Forum software by © MyBB