• 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Input conformation using generic table?

#1
[eluser]erik.brannstrom[/eluser]
On the website I'm currently working on I let users input their own information which is then stored in a database. However, people have different ways to input this information, for example by using or not using a leading 'The' or perhaps adding ', The' at the end.

Instead of forcing people into a specific pattern, I would now like to implement some type of system that automatically links the different values to one single generic value.

My idea is to carry one table with an id, name and generic_id where the generic_id is linked to the id of a generic table with id, name and url_name (for SEO friendly urls). As the input form is posted I automatically check for some common differences like the ones in my example to see if I can find an existing generic row, otherwise I add a new one.

Any comments or suggestions on this approach?

#2
[eluser]xwero[/eluser]
Why not search for the 'The ' and ', The' part and remove it as it should reference to the same thing. If you add a field to the table where you store if you removed the prefix/suffix you can display it any way you want on the site making the list of things more consistent.

#3
[eluser]erik.brannstrom[/eluser]
Problem is I might want to expand this with basics such as A and An, but also allow alternate names (e.g. ELO and Electric Light Orchestra). If my problem is widened like this, I need a bit more advanced solution.

#4
[eluser]eilrahc[/eluser]
Google has entire departments devoted to the research and implementation of this very problem. Smile Because you're essentially trying to normalize the use of a natural language, which is very hard and will probably remain a difficult issue until we have AI to do it for us.

I'd say the easiest way of dealing with it is to impose restrictions on those entering the data. Let them know that "The" or "A" should be omitted from the beginning or end of the string and tell them to avoid abbreviations. And then implement a very good search engine. If the data *really* matters, implement a queue so that human beings can review it before flagging it for permanent inclusion in the database. You can try using tags or keywords (like Flickr does) too.

If you tell us a bit more about what you're trying to do, somebody might be able to offer more/better suggestions, but chances are good that any reasonable solution will just have to be "good enough".

#5
[eluser]erik.brannstrom[/eluser]
I had almost forgotten about this thread.. Anyway, thanks for your input! It does feel comforting that I share this problem with Google Wink

The app I'm building is basically a simple record organizer. Users input their records and can easily show the collection to others and sort on artist / title / year. Imposing restrictions on the user would in this case be something like "make sure you use the correct name and spelling" which probably will be pointless, as most users will assume they are right anyway. In all fairness, it's not always that easy to know if a band is called for example Arcade Fire or The Arcade Fire.

Reviewing every single record added is not a possibility either. Though there are only about 100 users, it is not uncommon with hundreds of records being added in a single day, which is quite a lot to review considering this is just something I do on the side of full-time studies and part-time work Smile

I do realize that eilrahc is right that a solution can only be "good enough". Before I start digging in to this problem, I'm just curious to see if someone else has done something similar or have other suggestions?

#6
[eluser]ehicks727[/eluser]
[quote author="erik.brannstrom" date="1219602636"]I had almost forgotten about this thread.. Anyway, thanks for your input! It does feel comforting that I share this problem with Google Wink

The app I'm building is basically a simple record organizer. Users input their records and can easily show the collection to others and sort on artist / title / year. Imposing restrictions on the user would in this case be something like "make sure you use the correct name and spelling" which probably will be pointless, as most users will assume they are right anyway. In all fairness, it's not always that easy to know if a band is called for example Arcade Fire or The Arcade Fire.

Reviewing every single record added is not a possibility either. Though there are only about 100 users, it is not uncommon with hundreds of records being added in a single day, which is quite a lot to review considering this is just something I do on the side of full-time studies and part-time work Smile

I do realize that eilrahc is right that a solution can only be "good enough". Before I start digging in to this problem, I'm just curious to see if someone else has done something similar or have other suggestions?[/quote]

This is a common problem, actually. I run into this with State/County/City issues all the time.

The answer is to LOCK them into picking via a combo box (I guess it would be a select box.. whatever). Anyway, you'd need to obtain (scrape, use an API, whatever) a list of all bands, and albums, if you can get them. Then you can go about it two ways... you can have them pick from a list, or you can have them search and give them possible results. But in the end, they pick from a pre-formatted list, not enter into a text input box. Make sense??

Maybe there's an API you can tap into with the CDDB or somewhere. Or maybe you can scrape a list of bands and albums from somewhere.

This would be a great AJAX solution... you start typing "Meta" and it gives you 'suggestions', one of them being Metallica. Just like iTunes.

#7
[eluser]Jamie Rumbelow[/eluser]
Your best bet is to just store it in the database, then use a preg_match to validate when needed.


Digg   Delicious   Reddit   Facebook   Twitter   StumbleUpon  


  Theme © 2014 iAndrew  
Powered By MyBB, © 2002-2019 MyBB Group.