Add Autoampel scraping functionality and enhance vehicle data processing

2026-03-05 00:44:00 +03:00
parent c7750ac4ca
commit 223da27094
4 changed files with 1086 additions and 7 deletions
@@ -21,6 +21,12 @@ Scrape all brand pages:
dotnet run --project src/HsnTsnScraper/HsnTsnScraper.csproj > hsntsn.csv
```
Scrape directly from Autoampel typklassen pages (no hsn-tsn redirect chain):
```bash
dotnet run --project src/HsnTsnScraper/HsnTsnScraper.csproj -- --source autoampel > hsntsn.csv
```
Scrape only specific queries from `stdin`:
```bash
@@ -38,3 +44,18 @@ Repair only missing year fields from an existing CSV:
```bash
dotnet run --project src/HsnTsnScraper/HsnTsnScraper.csproj -- --repair-years --input-csv hsntsn.csv --output-csv hsntsn.repaired.csv
```
Merge core fields keyed by `HsnTsn` and write the result to PostgreSQL (source priority: `hsn-tsn.de` first, then `autoampel.de`):
```bash
dotnet run --project src/HsnTsnScraper/HsnTsnScraper.csproj -- --merge-core-db --pg-connection "Host=localhost;Port=5432;Database=hsntsn;Username=hsntsn;Password=hsntsn" --pg-table public.hsntsn_vehicle
```
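To make the priority rule concrete, here is a minimal sketch of a field-level merge over two CSV files. It is not the scraper's actual implementation. It assumes the rule means "keep the primary source's value and fall back to the secondary source only where the primary field is empty", that the key sits in the first column, and that no field contains a quoted comma. The function name `merge_by_priority` and the file names in the usage note are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: merge two CSVs keyed on column 1.
# Assumption: the first file has priority; the second file only fills
# fields the first file left empty, and contributes keys the first lacks.
# Assumption: plain CSV without quoted commas.
merge_by_priority() {
  local primary="$1" secondary="$2"
  awk -F, -v OFS=, '
    # First file (primary): remember each row by key, in input order.
    NR == FNR { rows[$1] = $0; order[++n] = $1; next }
    # Second file (secondary): fill gaps for known keys, append new keys.
    {
      if ($1 in rows) {
        np = split(rows[$1], p, ",")
        m = (np > NF) ? np : NF
        for (i = 2; i <= m; i++)
          if (p[i] == "" && i <= NF) p[i] = $i
        line = p[1]
        for (i = 2; i <= m; i++) line = line OFS p[i]
        rows[$1] = line
      } else {
        rows[$1] = $0
        order[++n] = $1
      }
    }
    END { for (k = 1; k <= n; k++) print rows[order[k]] }
  ' "$primary" "$secondary"
}
```

A call such as `merge_by_priority hsntsn.csv autoampel.csv > merged.csv` would then prefer values from the first file. If both files start with the same header row, the header passes through unchanged because it is treated as one more key.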
You can also pass the connection via environment variable:
```bash
export HSNTSN_PG="Host=localhost;Port=5432;Database=hsntsn;Username=hsntsn;Password=hsntsn"
dotnet run --project src/HsnTsnScraper/HsnTsnScraper.csproj -- --merge-core-db
```
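When relying on the environment variable, a small pre-flight check can catch a missing value before the scraper starts. The sketch below is an optional wrapper, not part of the tool: the function names `require_pg_connection` and `redact_pg_connection` are hypothetical, and the masking assumes the `Password=` key is spelled exactly as in the example above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a connection string with the password masked,
# so it can be logged without exposing the secret.
redact_pg_connection() {
  printf '%s\n' "$1" | sed -E 's/(Password=)[^;]*/\1***/'
}

# Hypothetical helper: succeed only if HSNTSN_PG is set and non-empty.
# Diagnostics go to stderr so stdout stays clean.
require_pg_connection() {
  if [ -z "${HSNTSN_PG:-}" ]; then
    echo "HSNTSN_PG is not set; export it or pass --pg-connection" >&2
    return 1
  fi
  echo "Using connection: $(redact_pg_connection "$HSNTSN_PG")" >&2
}
```

With these defined, `require_pg_connection && dotnet run --project src/HsnTsnScraper/HsnTsnScraper.csproj -- --merge-core-db` only starts the merge when the variable is present.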
Optional: if you already have a CSV, you can seed the merge from it by adding `--input-csv hsntsn.csv`.