I would like to know how popular a name is in the United States, preferably by rank, but the number of people with that name would also be fine.
The Social Security Administration keeps records of baby names starting in 1879. There is probably a way to determine the overall frequency of a name in the population, but I would settle for getting the name's rank in a given year and using that as a (flawed) proxy for popularity.
This is possible through their website, so I assume it is just a matter of parsing the results of the right POST request.
I am currently just running:
curl -d "year=2010&top=1000&number=p" http://www.ssa.gov/cgi-bin/popularnames.cgi > 2010_top_1000.html
and then parsing the HTML and searching the resulting file.
Is there a better way to do this?
Update: The method above returns at most 1000 names. The full list of baby names that occur more than 5 times is available as a zip file here: http://www.ssa.gov/oact/babynames/limits.html
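With the zip file, the rank can be computed locally instead of scraping HTML. The sketch below assumes the archive contains one text file per year named like `yob2010.txt`, with comma-separated lines of the form `name,sex,count`; both the member name and the column order are assumptions to check against the actual download. The function names are hypothetical.

```python
import csv
import io
import zipfile


def name_stats(lines, name, sex):
    """Return (rank, count) for name among records of the given sex, or None.

    lines is any iterable of "name,sex,count" strings (assumed format).
    Names with equal counts share a rank.
    """
    counts = {}
    for row in csv.reader(lines):
        if len(row) == 3 and row[1] == sex:
            counts[row[0].lower()] = int(row[2])
    target = counts.get(name.lower())
    if target is None:
        return None
    # Rank is one more than the number of names with a strictly higher count.
    rank = 1 + sum(1 for count in counts.values() if count > target)
    return rank, target


def name_stats_from_zip(zip_path, year, name, sex):
    """Look up a name in one year's file inside the downloaded archive."""
    member = f"yob{year}.txt"  # assumed member naming scheme
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8")
            return name_stats(text, name, sex)
```

For example, `name_stats_from_zip("names.zip", 2010, "Emma", "F")` would return a `(rank, count)` pair, where `names.zip` stands for whatever the downloaded file is called. Because each file carries counts, summing them across years would also give a rough overall frequency rather than a single-year rank.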