qwertox a day ago

What's the licensing of this? I wonder if it could be used to improve OSM.

--

To answer my own question

> This base layer of 100mm+ global places of interest ("POI") includes 22 core attributes (see schema here) that will be updated monthly and available for commercial use under the Apache 2.0 license framework.

Found on Simon Willison’s Weblog [0], quoting the official announcement [1]. His page also shows how to use it with Datasette.

[0] https://simonwillison.net/2024/Nov/20/foursquare-open-source...

[1] https://location.foursquare.com/resources/blog/products/four...

  • sp8962 a day ago
    • qwertox a day ago

      Thank you.

      There's an interesting link in that thread to a PMTiles viewer with the data in it:

      https://wipfli.github.io/foursquare-os-places-pmtiles/#map=1...

      • eichin 21 hours ago

        That's really convenient - I zoomed the map to an area in my town, clicked on a place, and even though the popup is just raw data, it let me see which fields hold which values (which I could then feed back into pandas search expressions on the parquet files). Just little things like "locality" is city/town in the US, and "fsq_category_labels" is where I'll find "Ice Cream Parlor".
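A minimal sketch of the kind of pandas search expression described above. It assumes the parquet rows carry `name`, `locality`, `region`, and a list-valued `fsq_category_labels` column; the sample rows and the `find_places` helper are hypothetical stand-ins. With the real files you would build `df` with `pd.read_parquet(...)` instead.

```python
import pandas as pd

# Hypothetical sample rows standing in for one parquet file; with real data,
# load a file with pd.read_parquet(...) instead of building the frame by hand.
df = pd.DataFrame(
    {
        "name": ["Scoop Shop", "Corner Cafe", "Cone Zone"],
        "locality": ["Duxbury", "Duxbury", "Newbury"],
        "region": ["MA", "MA", "MA"],
        # Assumed to hold a list of label strings per place.
        "fsq_category_labels": [
            ["Dining and Drinking > Dessert Shop > Ice Cream Parlor"],
            ["Dining and Drinking > Cafe"],
            ["Dining and Drinking > Dessert Shop > Ice Cream Parlor"],
        ],
    }
)


def find_places(frame, label_fragment, locality=None):
    """Return rows where any category label contains label_fragment."""
    mask = frame["fsq_category_labels"].apply(
        lambda labels: labels is not None
        and any(label_fragment in label for label in labels)
    )
    if locality is not None:
        # Optionally narrow the match to a single city/town.
        mask &= frame["locality"] == locality
    return frame[mask]


ice_cream_in_duxbury = find_places(df, "Ice Cream Parlor", locality="Duxbury")
print(ice_cream_in_duxbury[["name", "locality"]])
```

Matching on a label fragment rather than the full path keeps the expression short, at the cost of also matching any other label that happens to contain the same words.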

    • walterbell a day ago

      Thanks.

      > Foursquare and Overture places are like many geolocation-centric datasets: users aren’t supposed to ever see the raw data, either in a list or on the map. You have to filter by a confidence score. Otherwise, you’ll get tons of user-generated junk – pranks, mistakes, etc.. In the past, Foursquare would charge big bucks for the confidence scores as an upsell. If these scores aren’t part of the dataset, then no wonder the company feels comfortable releasing the data.

eichin a day ago

Ooh, I have an "all ice cream shops in Massachusetts" project for which this would be at least an interesting cross-reference (the "places humans show up at" bias in Foursquare's business works in my favor here).

Or rather, their former business? https://techcrunch.com/2024/10/22/farewell-to-foursquares-ap... says the user apps go away in less than a month...

  • snthd a day ago

    Here's an Overpass Turbo query against OpenStreetMap, if anyone is curious:

    https://overpass-turbo.eu/s/1UCX

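        [out:json][timeout:25];
        // geocodeArea is an Overpass Turbo macro that resolves the name to an area
        {{geocodeArea:Massachusetts}}->.searchArea;
        // nwr = nodes, ways, and relations; match either tagging scheme
        (
          nwr["amenity"="ice_cream"](area.searchArea);
          nwr["shop"="ice_cream"](area.searchArea);
        );
        out geom;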
    • eichin 21 hours ago

      Neat! I'll have to poke at it and see if I can come up with a usefully broader-but-not-too-noisy search - my personal "obsessively search 'ice cream in $town' for each town" collection has about twice that many individually reviewed locations and I'm nowhere near done collecting. (My "bracketing shot" for this is that mailing list vendors claim they can sell me a list of 700ish ice cream related business addresses - no idea how precise they are! but it suggests my current 370-item list is "getting there".)

tra3 a day ago

Why would Foursquare release this dataset? I can't help but try to think of an angle...

  • billfor a day ago

    Most of the stuff in that dataset has APIs you can use live. They sent a notification that they were turning off citysearch towards the end of this year and the beginning of next year. The API behind citysearch was the only way I know of that an individual could keep a categorized list of places like bars and restaurants under control. I would take their API output and convert it to KML to build my own Google map of places without all the Google ad crap. As well as being full-featured, the API would also mark places as closed, so your lists could auto-remove the old places and you could sometimes find what took over the location. I would also subcategorize places into favorites, new and notable, hidden places, types of bars, etc.

    I will miss Foursquare citysearch and its predecessor, a little Palm OS app known as Vindigo. Google and Yelp let you tag places in their apps but don’t have as good an API, so going forward it will be hard to maintain a private list of places that can be categorized, rendered, filtered, maintained, and exported. Google and Yelp largely keep your POI info captive.
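The API-to-KML conversion described above might look roughly like this. It is a sketch only: the `places` records, their field names, and the `places_to_kml` helper are hypothetical and do not reflect Foursquare's API response format. It groups places into KML folders by a personal category and drops anything marked closed.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def places_to_kml(places):
    """Build a KML document with one Folder per category, skipping closed places."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    folders = {}
    for place in places:
        if place.get("closed"):
            continue  # closed places are dropped so the list prunes itself
        category = place["category"]
        if category not in folders:
            folder = ET.SubElement(doc, f"{{{KML_NS}}}Folder")
            ET.SubElement(folder, f"{{{KML_NS}}}name").text = category
            folders[category] = folder
        placemark = ET.SubElement(folders[category], f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = place["name"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML coordinates are written as longitude,latitude (in that order).
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = (
            f"{place['lon']},{place['lat']}"
        )
    return ET.tostring(kml, encoding="unicode")


# Hypothetical records; the keys are illustrative, not a real API schema.
places = [
    {"name": "Example Bar", "category": "favorites", "lat": 40.73, "lon": -73.99},
    {"name": "Old Diner", "category": "favorites", "lat": 40.72, "lon": -74.0, "closed": True},
    {"name": "Hidden Cafe", "category": "hidden places", "lat": 40.71, "lon": -73.98},
]
kml_text = places_to_kml(places)
print(kml_text)
```

The resulting string can be saved as a `.kml` file and imported into a mapping tool that reads KML.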

    • marklit a day ago

      To anyone considering going to all this effort, consider doing this work on OpenStreetMap. 50K contributors make OSM a bit better every month but a good map is never finished. https://rapideditor.org/edit

      • RicoElectrico a day ago

        Um, RapID is big tech's spin on OSM editing. Clicking stuff to import from AI or government data is hardly the essence of OSM, a complement at best.

        Use iD or JOSM on desktop, StreetComplete/Every Door/Go Map!!/Vespucci on the phone. Survey POIs in your local area, with your own feet. Big tech can't do that ;)

  • martinkallstrom a day ago

    The angle I can think of is to preserve the legacy of a dwindling operation and share the value that was created at the peak.

  • jsemrau a day ago

    90% of Foursquare's revenue comes from enterprise clients. Releasing this dataset would not cannibalize that revenue, but it would let the general population fix bugs and find new use cases, which might put Foursquare in a better spot when competing with Google Maps, Yelp, and Facebook Places.

BrandiATMuhkuh a day ago

About 10 years ago there was a project called "Sightsmap". It was a heatmap of the most photographed sights in the world.

I really loved the map for planning road trips and city trips.

I would love such a service again. I think OP's data/maps represent basically the same information.

junto a day ago

These are mostly way out of date now, right?

  • dzogchen a day ago

    No. Each POI has date properties indicating date added, date closed and last check date. Most POIs are surprisingly up to date.

    See my other comment to explore the dataset.

    • jorams a day ago

      A comment on the OSM community thread notes, and I can confirm based on the map you linked, that it contains many POIs that used to exist but haven't for a while, which nevertheless have a date_refreshed from this year.

      • eichin 7 hours ago

        Looking at the narrow subset I care about (ice cream in Massachusetts), they've got a bit of duplication, particularly "vague place and precise place that aren't combined":

        Far Fars, None, Duxbury,

        Farfars Danish Ice Cream, 272 Saint George St, Duxbury, http://farfarsicecream.com

        Georgie Porgie's Ice Cream Factory, 2 Northern Blvd, Newbury,

        Georgie Porgie’s, None, Newbury,

        Goodhile's Country Store, 1122 Wachusett St, Jefferson,

        Goodhile’s Ice Cream, None, Jefferson,

        plus some places with old names but current URLs ('Winterbottom Ice Cream', 'http://www.perryslaststand.com' - which also has an entry with the same street address and the new name, both have 'date_refreshed': '2024-10-15'; the store has always been Perry's, the family name is just the LLC behind it - feels like the result of a sloppy dataset merge.)
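One crude way to surface such "vague place and precise place" pairs is to normalize names and group rows by locality plus a short name prefix. This is a heuristic sketch, not how Foursquare deduplicates; the `normalize` and `candidate_duplicates` helpers and the 6-character prefix length are assumptions chosen to fit the examples above.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(name):
    """Lowercase, strip accents and punctuation, keep only letters and digits."""
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    return re.sub(r"[^a-z0-9]", "", ascii_name.lower())


def candidate_duplicates(rows, prefix_len=6):
    """Group (name, address, locality) rows sharing a locality and name prefix."""
    groups = defaultdict(list)
    for name, address, locality in rows:
        key = (locality, normalize(name)[:prefix_len])
        groups[key].append((name, address))
    # Only groups with more than one entry are duplicate candidates.
    return [entries for entries in groups.values() if len(entries) > 1]


# Rows copied from the listing above: (name, address, locality).
rows = [
    ("Far Fars", None, "Duxbury"),
    ("Farfars Danish Ice Cream", "272 Saint George St", "Duxbury"),
    ("Georgie Porgie's Ice Cream Factory", "2 Northern Blvd", "Newbury"),
    ("Georgie Porgie’s", None, "Newbury"),
    ("Goodhile's Country Store", "1122 Wachusett St", "Jefferson"),
    ("Goodhile’s Ice Cream", None, "Jefferson"),
]
for group in candidate_duplicates(rows):
    print(group)
```

A short prefix will also produce false positives (for example, unrelated shops whose names start with the town name), so the groups are candidates for manual review rather than confirmed duplicates.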

        • eichin 6 hours ago

          Hand inspected 30 items[^sample] and found 10 real ice cream shops, 9 "exists but closed at least a year ago", and 6 "couldn't even find a vague match". (Also 2 real frozen custard shops, which for my purposes don't count but if you're judging generic "retail business data quality" they're probably legit.)

          Over the "Ice Cream Parlor" data subset, only 171 records even have a date_closed (a little over 10%); of the 9 I identified as closed, only 1 had a date_closed field, which roughly checks out.

          [^sample]: US, MA, has "Dining and Drinking > Dessert Shop > Ice Cream Parlor" as a label, which is about 1300 items; sorted by name and picked the first 30 - not a random sample.
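The tally above can be reproduced along these lines. This is a sketch over made-up records: `country`, `region`, and `name` are assumed column names, and the `ice_cream_subset` and `closed_share` helpers are hypothetical. Only the label string, the `date_closed` field, and the "sort by name, take the first 30" sampling come from the comment.

```python
LABEL = "Dining and Drinking > Dessert Shop > Ice Cream Parlor"

# Hypothetical records standing in for rows of the dataset.
records = [
    {"name": "B Cones", "country": "US", "region": "MA",
     "fsq_category_labels": [LABEL], "date_closed": None},
    {"name": "A Scoops", "country": "US", "region": "MA",
     "fsq_category_labels": [LABEL], "date_closed": "2021-06-01"},
    {"name": "C Diner", "country": "US", "region": "MA",
     "fsq_category_labels": ["Dining and Drinking > Diner"], "date_closed": None},
    {"name": "D Gelato", "country": "US", "region": "NH",
     "fsq_category_labels": [LABEL], "date_closed": None},
]


def ice_cream_subset(rows, country="US", region="MA"):
    """Rows in the given country/region carrying the ice cream label, sorted by name."""
    subset = [
        r for r in rows
        if r["country"] == country
        and r["region"] == region
        and LABEL in (r["fsq_category_labels"] or [])
    ]
    return sorted(subset, key=lambda r: r["name"])


def closed_share(rows):
    """Fraction of rows that have a non-empty date_closed."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r["date_closed"]) / len(rows)


subset = ice_cream_subset(records)
sample = subset[:30]  # first 30 by name: convenient, but not a random sample
print(len(subset), closed_share(subset))
```

With the figures from the comment, 171 of about 1300 records is roughly 13%, and 1 of the 9 hand-identified closures is roughly 11%, which is why the two rates "roughly check out" against each other.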

tech234a a day ago

Article has a minor typo: it reads "The US has ~23.5M records followed by Indonesia and Turkey with over 80M each" but the second figure should be 8M.

  • marklit a day ago

    Fixed. Thanks for spotting that.

tipiirai a day ago

I thought Foursquare no longer existed.

FollowingTheDao a day ago

I started reading the article but stopped out of sheer jealousy of that guy's PC rig:

"I'm using a 6 GHz Intel Core i9-14900K CPU. It has 8 performance cores and 16 efficiency cores with a total of 32 threads and 32 MB of L2 cache. It has a liquid cooler attached and is housed in a spacious, full-sized, Cooler Master HAF 700 computer case. I've come across videos on YouTube where people have managed to overclock the i9-14900KF to 9.1 GHz.

The system has 96 GB of DDR5 RAM clocked at 6,000 MT/s and a 5th-generation, Crucial T700 4 TB NVMe M.2 SSD which can read at speeds up to 12,400 MB/s. There is a heatsink on the SSD to help keep its temperature down. This is my system's C drive."

  • eichin 21 hours ago

    That does sound glorious, but I didn't have any trouble loading individual parquet files into pandas on a 3 year old Thinkpad X1...

wslh a day ago

Am I mistaken, or are we now at a data inflection point? As a frustrated consumer of POIs (e.g. Google Maps), I suspect that Foursquare understands their real position of power is no longer in the data itself (since many businesses are now doing the same) but in owning the last mile of the user experience. From a business perspective, we can create countless sites using this data, but that alone won’t significantly move the needle.