02-07-2007, 03:50 PM, posted to rec.gardens
mfranklin@msfxdesigns.com
Downloadable Horticulture database?

On Jul 2, 9:46 am, "JoeSpareBedroom" wrote:

On Jul 1, 5:39 pm, "JoeSpareBedroom" wrote:

On Jul 1, 2:34 pm, "JoeSpareBedroom" wrote:
Does the USDA file put genus, species and varietal name in 3 different
fields?



Thanks for the posts. I have found a few. I had already looked at
the USDA csv, but it's over 200k plants, many of which are weeds or
food crop. It would be rather time consuming to go through and pick
out the plants significant to landscaping.


I'll check out the links you guys gave. Most of the stuff I have
found, people want to sell their database. I'm willing to trade the
information with anyone else who has info, so it's not like I'm only
looking to receive.


Thanks again for all the leads.


Michael


On Jun 29, 4:07 pm, wrote:
Does anyone know where I can get a csv file of a
horticulture database
for a project I'm working on? I'm trying to create an easy-to-use
online database to help me source plants for my design projects. The
project is www.landscapedesignexchange.com.


Any leads on data would be appreciated.


Thanks.


Michael


No. It puts it all in one field. So after having to sort thru 200k
of plants to find out the ones significant to landscaping, I'd have to
transform it. After all that, it is missing all of the relevant
criteria to select a plant for a landscape project. It seems to me
that it is more trouble than it's worth just to get the scientific/
common name and nothing else.


You should be able to parse the words into separate columns easily.
Assuming, for example, that you have things like this in one field:


Pieris japonica


(all words separated by spaces, in other words), do a search-replace,
converting all spaces into some useless character like the tilde ~. Then,
use the text to columns thing with the tilde as the delimiter. Voila.
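
The same text-to-columns idea can be sketched in a few lines of Python. This is only an illustration: the function name `split_name` and the three-column layout (genus, species, everything else) are assumptions, not anything taken from the USDA file itself.

```python
def split_name(name, columns=3):
    """Split a plant name on whitespace into a fixed number of columns.

    Assumed layout: genus, species, then everything else in the last column.
    """
    parts = name.split()
    # The first (columns - 1) words each get their own column.
    head, tail = parts[:columns - 1], parts[columns - 1:]
    # Any remaining words are folded into the final column.
    row = head + [" ".join(tail)]
    # Pad short names with empty strings so every row has the same width.
    return row + [""] * (columns - len(row))


print(split_name("Pieris japonica"))  # ['Pieris', 'japonica', '']
```

Splitting on whitespace directly avoids the tilde step, but it makes the same assumption: every word sits in a predictable position.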


I understand parsing text files to csv and then flipping it to a
database. I've worked on quite a few databases as my day job is as a
computer geek (I'm only a plant geek by night/weekends). The major
problem with the USDA csv is that it contains a massive amount of
irrelevant data. Beyond that, not all plants fall easily into the
example Genus Species Cultivar/Var. For example, Abelia 'Edward
Goucher' omits species and has 2 words for cultivar. I've been doing
quite a bit of clean-up on the csv files I've been able to get. I'm
just dreading tackling the USDA csv with 200k+ entries then
eliminating tens of thousands of irrelevant entries and then cleaning
up 40-60% that don't easily convert.
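
A rough sketch of how names like the Abelia example might be handled: pull out anything in single quotes as the cultivar first, then treat the remaining words as genus and species. The function name `parse_name` is hypothetical, and the sketch assumes cultivars are always wrapped in single quotes and contain no apostrophes of their own, which may not hold for the real USDA data.

```python
import re

# Assumption: a cultivar is whatever sits between a pair of single quotes.
CULTIVAR = re.compile(r"'\s*([^']+?)\s*'")


def parse_name(name):
    """Return (genus, species, cultivar); missing parts are empty strings."""
    match = CULTIVAR.search(name)
    cultivar = match.group(1) if match else ""
    # Drop the quoted cultivar, then split what is left on whitespace.
    rest = CULTIVAR.sub(" ", name).split()
    genus = rest[0] if rest else ""
    # Everything after the genus is kept together as the species field.
    species = " ".join(rest[1:])
    return genus, species, cultivar


print(parse_name("Abelia ' Edward Goucher'"))  # ('Abelia', '', 'Edward Goucher')
print(parse_name("Pieris japonica"))           # ('Pieris', 'japonica', '')
```

Entries that still come back with an empty genus or an unexpectedly long species field would be the ones to flag for manual clean-up.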


Then I end up with tens of thousands of relevant plants with zero
information about them. Over time as I use them in design projects I
can add data, but I'm still forced to go outside my database to pull
relevant information.


I know this is going to be a massive undertaking and an on-going
project. I appreciate all the suggestions and if anyone is interested
in trading content, get with me at http://www.landscapedesignexchange.com.
Thanks again.


I do this all day long with grocery data that was assembled by slobs. If I
could reach through the phone and grab some of these people by the
throat......

Interested in splitting the task? You take half the file and I take the
other? Reassemble it later?
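
Splitting the file in two and stitching it back together could look roughly like this. The function names and file paths are made up for illustration, and the sketch assumes the csv has a single header row and is small enough to read into memory (200k rows should be).

```python
import csv


def split_csv(src, first_path, second_path):
    """Write the two halves of src to separate files, each with the header row."""
    with open(src, newline="") as f:
        rows = list(csv.reader(f))
    header, body = rows[0], rows[1:]  # assumes the first row is a header
    mid = len(body) // 2
    for path, chunk in ((first_path, body[:mid]), (second_path, body[mid:])):
        with open(path, "w", newline="") as out:
            writer = csv.writer(out)
            writer.writerow(header)
            writer.writerows(chunk)


def merge_csv(first_path, second_path, dest):
    """Concatenate the two halves, keeping only the first file's header."""
    with open(dest, "w", newline="") as out:
        writer = csv.writer(out)
        for index, path in enumerate((first_path, second_path)):
            with open(path, newline="") as f:
                rows = list(csv.reader(f))
            # Skip the repeated header row in the second file.
            writer.writerows(rows if index == 0 else rows[1:])
```

Each person would clean up one half, keeping the column layout unchanged so the merge stays a plain concatenation.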


lol . . . that's the way I feel when I'm working on data people send
me at work. If I could just get my hands on them for 5 minutes . . .

I'd be grateful to split the task. I'd also be more than happy to
give you a copy of the entire database at the end if you have any use
for it.