Add Ookla Speedtest Dataset #188
Open
mrsanford wants to merge 7 commits into aiddata:master
Pull Request Overview
This PR adds support for the Ookla Speedtest dataset by introducing new scripts for data ingestion, processing, and raster generation, along with configuration updates and dependency adjustments.
- Updated dependency list in pyproject.toml to align with required packages for the new dataset.
- Added multiple Python modules for downloading, processing, and outputting raster layers from Ookla Speedtest data.
- Included a new configuration file and a main script that integrates the workflow.
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| pyproject.toml | Dependency updates to support the new dataset ingestion and processing. |
| datasets/ookla_speedtest/src/transform_populate.py | Implements functions to parse parquet files and generate sparse raster arrays. |
| datasets/ookla_speedtest/src/make_geopackage.py | Provides an optional utility to export GeoDataFrames as GeoPackages. |
| datasets/ookla_speedtest/src/helpers.py | Defines constants and paths used across the new dataset processing workflow. |
| datasets/ookla_speedtest/src/generate_raster.py | Contains functions to define raster profiles and write multiband raster data. |
| datasets/ookla_speedtest/src/download_dataset.py | Implements functions to download dataset files from S3 using boto3. |
| datasets/ookla_speedtest/main.py | Orchestrates the download and processing pipeline for Ookla Speedtest data. |
| datasets/ookla_speedtest/config.toml | Provides configuration parameters and run settings for the dataset integration. |
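To make the "sparse raster arrays" step in the table above more concrete, here is a minimal sketch of one way it could work. It assumes each record carries a Bing-style `quadkey` string identifying its tile, and it stores values in a plain dictionary keyed by `(row, col)`. The function names `quadkey_to_tile` and `populate_sparse` are hypothetical and are not taken from `transform_populate.py`, whose actual implementation is not shown in this conversation.

```python
from typing import Dict, Iterable, Tuple


def quadkey_to_tile(quadkey: str) -> Tuple[int, int, int]:
    """Decode a Bing-style quadkey into (x, y, zoom) tile coordinates."""
    x = y = 0
    zoom = len(quadkey)
    for i, digit in enumerate(quadkey):
        # Each digit contributes one bit to x and/or y, most significant first.
        mask = 1 << (zoom - i - 1)
        if digit == "1":
            x |= mask
        elif digit == "2":
            y |= mask
        elif digit == "3":
            x |= mask
            y |= mask
        elif digit != "0":
            raise ValueError(f"Invalid quadkey digit: {digit!r}")
    return x, y, zoom


def populate_sparse(records: Iterable[Tuple[str, float]]) -> Dict[Tuple[int, int], float]:
    """Map (quadkey, value) pairs to a sparse {(row, col): value} dictionary.

    Hypothetical helper: assumes all quadkeys share the same zoom level.
    """
    cells: Dict[Tuple[int, int], float] = {}
    for quadkey, value in records:
        x, y, _zoom = quadkey_to_tile(quadkey)
        cells[(y, x)] = value  # row = tile y, col = tile x
    return cells
```

A dictionary keeps memory proportional to the number of populated tiles, which matters when the full grid is far larger than the set of tiles with data. A real pipeline would likely swap this for a sparse matrix type before writing raster bands.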
| # Doing the actual donwloading; calling the S3 client, and putting the S3 filenames together | ||
| def download_files(year: int, quarters: dict = QUARTERS) -> None: | ||
| """ | ||
| Donwnloads the performance data files from the target Ookla S3 bucket for 1 year to a local directory |
The word 'Donwnloads' is misspelled. It should be corrected to 'Downloads'.
Suggested change
| Donwnloads the performance data files from the target Ookla S3 bucket for 1 year to a local directory | |
| Downloads the performance data files from the target Ookla S3 bucket for 1 year to a local directory |
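For context on what a function like `download_files` might do, the sketch below builds one S3 key per quarter and fetches each file with boto3. The bucket name, key layout, `QUARTERS` mapping, and the `build_keys` helper are assumptions made for illustration, and the signature differs from the one quoted in the diff. The PR's own `download_dataset.py` may be organized differently.

```python
from pathlib import Path
from typing import Dict, List

# Assumed values for illustration; not taken from the PR's helpers.py.
BUCKET = "ookla-open-data"
QUARTERS: Dict[int, str] = {1: "01-01", 2: "04-01", 3: "07-01", 4: "10-01"}


def build_keys(year: int, service: str = "fixed",
               quarters: Dict[int, str] = QUARTERS) -> List[str]:
    """Return one S3 object key per quarter for the given year (assumed layout)."""
    return [
        f"parquet/performance/type={service}/year={year}/quarter={q}/"
        f"{year}-{start}_performance_{service}_tiles.parquet"
        for q, start in sorted(quarters.items())
    ]


def download_files(year: int, dest: Path, service: str = "fixed") -> None:
    """Download one year of performance files to a local directory."""
    # Imported lazily so build_keys can be used without boto3 installed.
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    # Anonymous (unsigned) access, assuming the bucket is public.
    s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
    dest.mkdir(parents=True, exist_ok=True)
    for key in build_keys(year, service):
        s3.download_file(BUCKET, key, str(dest / Path(key).name))
```

Separating key construction from the network call keeps the filename logic testable without touching S3.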
| if output_path.exists() and not self.overwrite_processing: | ||
| logger.info(f"Processed layer exists: {output_path}") | ||
| else: | ||
| logger.info(f"Processing file: {input_path}. Ouput will be saved to: {output_path}") |
The word 'Ouput' is misspelled. It should be corrected to 'Output'.
Suggested change
| logger.info(f"Processing file: {input_path}. Ouput will be saved to: {output_path}") | |
| logger.info(f"Processing file: {input_path}. Output will be saved to: {output_path}") |
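The skip-or-process check quoted in this comment can be sketched as a standalone function. This is a minimal illustration, assuming the overwrite flag is passed as an argument rather than read from `self.overwrite_processing`. The names `process_if_needed` and `process` are hypothetical.

```python
import logging
from pathlib import Path
from typing import Callable

logger = logging.getLogger(__name__)


def process_if_needed(input_path: Path, output_path: Path,
                      process: Callable[[Path, Path], None],
                      overwrite: bool = False) -> bool:
    """Run `process` unless the output already exists and overwrite is off.

    Returns True if processing ran, False if it was skipped.
    """
    if output_path.exists() and not overwrite:
        logger.info("Processed layer exists: %s", output_path)
        return False
    logger.info("Processing file: %s. Output will be saved to: %s",
                input_path, output_path)
    process(input_path, output_path)
    return True
```

Returning a boolean lets the caller count how many layers were regenerated versus reused, which can be handy when rerunning a partially completed job.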
Member
I didn't mean to summon the AI spellchecker 😂
Author
Haha no worries. I made some final changes for now, but let me know if I need to fix/update anything. Thank you!
This PR contains scripts for ingesting, processing, and outputting raster layers for the Ookla Speedtest dataset, and applies them to the class.