Add Autoampel scraping functionality and enhance vehicle data processing
@@ -21,6 +21,12 @@ Scrape all brand pages:
dotnet run --project src/HsnTsnScraper/HsnTsnScraper.csproj > hsntsn.csv
```

Scrape directly from Autoampel typklassen pages (no hsn-tsn redirect chain):

```bash
dotnet run --project src/HsnTsnScraper/HsnTsnScraper.csproj -- --source autoampel > hsntsn.csv
```

Scrape only specific queries from `stdin`:

```bash
@@ -38,3 +44,18 @@ Repair only missing year fields from an existing CSV:
```bash
dotnet run --project src/HsnTsnScraper/HsnTsnScraper.csproj -- --repair-years --input-csv hsntsn.csv --output-csv hsntsn.repaired.csv
```
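
To see how many rows still lack a year before and after a repair run, a small helper along these lines may be useful. It is only a sketch and not part of the scraper: `count_empty_in_column` is a hypothetical name, and it assumes a plain comma-separated file with a header row, Unix line endings, and no quoted commas.

```bash
# Hypothetical helper: count data rows whose value in the named column is empty.
# Assumes simple CSV (header row, no quoted commas, LF line endings).
count_empty_in_column() {
  local file="$1" column="$2"
  awk -F',' -v col="$column" '
    NR == 1 {
      # Locate the column index by its header name.
      for (i = 1; i <= NF; i++) if ($i == col) idx = i
      if (!idx) { print "column not found: " col > "/dev/stderr"; exit 2 }
      next
    }
    $idx == "" { missing++ }
    END { if (idx) print missing + 0 }
  ' "$file"
}
```

For example, `count_empty_in_column hsntsn.repaired.csv YearFrom`, where `YearFrom` stands in for whatever the year column is actually called in your CSV.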

Merge core fields by `HsnTsn` and write to PostgreSQL (priority: `hsn-tsn.de` then `autoampel.de`):

```bash
dotnet run --project src/HsnTsnScraper/HsnTsnScraper.csproj -- --merge-core-db --pg-connection "Host=localhost;Port=5432;Database=hsntsn;Username=hsntsn;Password=hsntsn" --pg-table public.hsntsn_vehicle
```
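
One way to picture the priority rule is as a field-level fallback: keep each value from the higher-priority source and fill only the empty fields from the lower-priority one. The sketch below illustrates that reading on two CSV files. It is an assumption about the merge semantics, not the scraper's actual implementation, and it does not touch PostgreSQL. `merge_by_priority` is a hypothetical helper that assumes both files share the same header, use the first column as the key, and contain no quoted commas.

```bash
# Hypothetical sketch: merge two simple CSV files keyed on column 1.
# Values from the first (higher-priority) file win; empty fields are
# filled from the second file. Keys found only in the second file are appended.
merge_by_priority() {
  local primary="$1" secondary="$2"
  awk -F',' -v OFS=',' '
    FNR == 1 {                       # header row of each file
      if (NR == 1) { print; ncol = NF }
      next
    }
    NR == FNR {                      # rows of the higher-priority file
      order[++n] = $1; seen[$1] = 1
      for (i = 2; i <= ncol; i++) val[$1, i] = $i
      next
    }
    {                                # rows of the lower-priority file
      if (!($1 in seen)) { order[++n] = $1; seen[$1] = 1 }
      for (i = 2; i <= ncol; i++) if (val[$1, i] == "") val[$1, i] = $i
    }
    END {
      for (k = 1; k <= n; k++) {
        key = order[k]; line = key
        for (i = 2; i <= ncol; i++) line = line OFS val[key, i]
        print line
      }
    }
  ' "$primary" "$secondary"
}
```

Called as `merge_by_priority hsn-tsn.csv autoampel.csv` (both file names are placeholders), it prints the merged rows to standard output in first-seen key order.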

You can also pass the connection via environment variable:

```bash
export HSNTSN_PG="Host=localhost;Port=5432;Database=hsntsn;Username=hsntsn;Password=hsntsn"
dotnet run --project src/HsnTsnScraper/HsnTsnScraper.csproj -- --merge-core-db
```
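
When the connection string comes from the environment, a wrapper script may want to fail early if the variable was never exported. The guard below is a minimal sketch of that idea: `require_pg_connection` is a hypothetical helper, not something the scraper provides, and it only checks that `HSNTSN_PG` is non-empty, not that the string is valid.

```bash
# Hypothetical guard: stop before running the merge if HSNTSN_PG is unset or empty.
require_pg_connection() {
  if [ -z "${HSNTSN_PG:-}" ]; then
    echo "HSNTSN_PG is not set; export it or pass --pg-connection" >&2
    return 1
  fi
}
```

A script could then chain it in front of the merge command, for example `require_pg_connection && dotnet run --project src/HsnTsnScraper/HsnTsnScraper.csproj -- --merge-core-db`.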

Optional: if you already have a CSV, you can still seed from it with `--input-csv hsntsn.csv`.