Dataset Profiles

Data profiling information about the published datasets on the DataSF open data portal

The source code that generates this dataset can be found at:

https://github.com/DataSF/datasf-profiler


Publishing Department: Mayor

Dataset Link: https://data.sfgov.org/d/w6q6-i3uv
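
The field table below lists an API Key for each field. As a minimal sketch, assuming the dataset is served at the usual Socrata resource URL pattern (`/resource/<four by four>.json`) and that the API Key values are the column names accepted by the SODA `$select` and `$where` parameters, a query URL could be built as follows. The `BASE_URL` constant and the helpers `build_query_url` and `fetch_rows` are illustrative names, not part of datasf-profiler.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: standard Socrata resource endpoint for the four by four
# taken from the Dataset Link (w6q6-i3uv).
BASE_URL = "https://data.sfgov.org/resource/w6q6-i3uv.json"


def build_query_url(fields, where=None, limit=100):
    """Build a SODA query URL that selects the given API keys."""
    params = {"$select": ",".join(fields), "$limit": str(limit)}
    if where:
        params["$where"] = where
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_rows(url):
    """Fetch and decode the JSON rows for a query URL (requires network access)."""
    with urlopen(url) as response:
        return json.load(response)


url = build_query_url(
    ["dataset_name", "publishing_health"],
    where="publishing_health = 'Stale'",
    limit=5,
)
print(url)
```

Calling `fetch_rows(url)` would then return a list of dictionaries keyed by the selected API keys, provided the assumptions above hold for this dataset.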

Field Name (opt. alias) | Field Definition | API Key
Blob Count | Number of blob fields in a dataset | blob_count
Boolean Count | Number of boolean fields in a dataset | boolean_count
Category | The semantic category that the dataset is associated with | category
Created Date | Date the dataset was created | created_date
Data Change Frequency | A time period that indicates how often the data in the dataset changes | data_change_frequency
Dataset Name | The name of the dataset | dataset_name
datasetID | The four by four identifier of the dataset | datasetid
Days Since First Created | The number of days between today and the date when the dataset was first created | days_since_first_created
Days Since Last Updated | The number of days between today and the date when the dataset was last updated | days_since_last_updated
Department | The City department that is responsible for the dataset | department
Description | Description of the dataset | description
Documented Count | Number of fields that have been documented and have completed field definitions | documented_count
Documented Percentage | The percentage of fields in the dataset that have been documented | documented_percentage
Downloads | Number of downloads as reported by Socrata | downloads
Dupe Record Count | The number of row-level duplicates, i.e. rows that are exact copies | dupe_record_count
Dupe Record Percent | The percentage of the dataset that is composed of row-level duplicates | dupe_record_percent
Field Count | The total number of fields in a dataset | field_count
Global Field Count | The number of fields in the dataset mapped to a global field alias and definition | global_field_count
Global Field Percentage | The percentage of the dataset's fields that are global fields; a dataset with a high percentage of global fields may be a reference dataset | global_field_percentage
Keywords | Associated tags or keywords to help in classifying for search | keywords
Last Updt Dt Data | Date that the dataset was last updated | last_updt_dt_data
Line Count | Number of geometry: line fields in a dataset | line_count
Multiline Count | Number of geometry: multiline fields in a dataset | multiline_count
Multipoint Count | Number of multipoint fields in a dataset | multipoint_count
Multipolygon Count | Number of geometry: multipolygon fields in a dataset | multipolygon_count
nbeID | New backend ID. This ID corresponds to the SODA API 2.1 endpoint for this dataset. | nbeid
Notes | Notes about the dataset | notes
Numeric Count | Number of numeric fields in a dataset | numeric_count
Point Count | Number of point fields in a dataset | point_count
Polygon Count | Number of geometry: polygon fields in a dataset | polygon_count
Profile Last Updt Dt | The date that the dataset was last profiled | profile_last_updt_dt
Publishing Frequency | The time period in which the dataset should be (re)published | publishing_frequency
Publishing Health | Indicates whether a dataset is being updated as specified by its publishing schedule. On Time: the dataset updated on time. Delayed: the dataset is late to update. Stale: the dataset has not been updated in more than two times the time period indicated in publishing frequency. | publishing_health
Record Count | The total number of records in a dataset | record_count
rowIdentifier | pkid for a dataset, if it exists | rowidentifier
rowLabel | Text for what a row in the dataset represents | rowlabel
Text Count | Number of text fields in a dataset | text_count
Time Count | Number of time fields in a dataset | time_count
Timestamp Count | Number of timestamp fields in a dataset | timestamp_count
Visits | Number of user visits to the dataset as reported by Socrata | visits
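
The Publishing Health definition above can be read as a simple threshold rule. The following is a minimal sketch of that rule, not the profiler's actual implementation: it assumes the publishing frequency has already been converted to a number of days (the parameter `publishing_period_days` is hypothetical), and it assumes that "late to update" means more than one publishing period has elapsed since the last update.

```python
def publishing_health(days_since_last_updated: int, publishing_period_days: int) -> str:
    """Classify a dataset as On Time, Delayed, or Stale.

    Assumptions (not confirmed by the source):
    - publishing_period_days is the publishing frequency expressed in days
    - Delayed means more than one period has passed since the last update
    """
    if publishing_period_days <= 0:
        raise ValueError("publishing_period_days must be positive")
    # Stale: not updated in more than two times the publishing period
    if days_since_last_updated > 2 * publishing_period_days:
        return "Stale"
    # Delayed: past one period but not yet stale
    if days_since_last_updated > publishing_period_days:
        return "Delayed"
    return "On Time"


# A weekly dataset checked at three different ages
for days in (5, 10, 15):
    print(days, publishing_health(days, 7))
```

Under these assumptions, a weekly dataset last updated 5, 10, and 15 days ago would be classified as On Time, Delayed, and Stale respectively.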