Configure looser import spec for bills #56

Merged: hancush merged 2 commits into main from hcg/bill-res on Nov 17, 2025

@hancush (Collaborator) commented Nov 11, 2025

Overview

This PR loosens the bill resolution spec so that an incoming bill is matched to an existing one using only its identifier and from organization.

Closes #30, related to #52

Companion to opencivicdata/pupa#355
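
A minimal sketch of what a looser resolution spec changes, not the actual pupa importer code. The field names, the `resolve` helper, and the sample rows are hypothetical, and the sketch assumes the stricter spec also matched on legislative session, which this PR does not state explicitly.

```python
from typing import Optional

# Hypothetical field names for illustration only; the real spec is
# configured in pupa's bill importer (see opencivicdata/pupa#355).
STRICT_KEYS = ("identifier", "from_organization", "legislative_session")  # assumed
LOOSE_KEYS = ("identifier", "from_organization")


def resolve(scraped: dict, existing: list, keys: tuple) -> Optional[dict]:
    """Return the single existing bill whose values match `scraped` on `keys`."""
    spec = {key: scraped[key] for key in keys}
    matches = [
        row for row in existing
        if all(row.get(key) == value for key, value in spec.items())
    ]
    if len(matches) > 1:
        raise ValueError(f"spec {spec} matched {len(matches)} bills")
    return matches[0] if matches else None


# Made-up rows: the stored bill and the freshly scraped bill disagree on session.
existing = [
    {"identifier": "2024-0885", "from_organization": "Board of Directors",
     "legislative_session": "2024"},
]
scraped = {"identifier": "2024-0885", "from_organization": "Board of Directors",
           "legislative_session": "2017"}

print(resolve(scraped, existing, STRICT_KEYS))  # no match under the strict spec
print(resolve(scraped, existing, LOOSE_KEYS))   # matches the stored bill
```

Under these assumptions, the strict spec treats the scraped bill as unrelated to the stored one, while the loose spec resolves it to the existing record so it can be updated in place.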

Demo

Local run against a previously failing bill, pointed at the live site's database:

hannah@the-hankinator scrapers-lametro % docker compose run --rm -e DATABASE_URL=<PROD_DB_URL> scrapers pupa update lametro bills matter_ids=10740
[+] Building 0.0s (0/0)                                                                                                       docker:desktop-linux
[+] Creating 1/0
 ✔ Container scrapers-lametro-postgres  Running                                                                                               0.0s 
[+] Building 0.0s (0/0)                                                                                                       docker:desktop-linux
Operations to perform:
  Apply all migrations: contenttypes, core, councilmatic_core, legislative, pupa
Running migrations:
  No migrations to apply.
15 divisions found in the CSV, and 194969 already in the DB
The DB contains all CSV contents; no work to be done!
lametro (scrape, import)
  bills: {'matter_ids': '10740'}
Not checking sessions...
11/11/2025 15:31:42 INFO pupa: save jurisdiction Los Angeles County Metropolitan Transportation Authority as jurisdiction_ocd-jurisdiction-country:us-state:ca-county:los_angeles-transit_authority.json
11/11/2025 15:31:42 INFO pupa: save organization Board of Directors as organization_d0b4f5ec-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post Mayor of the City of Los Angeles as post_d0b4f722-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post Los Angeles County Board Supervisor, District 1 as post_d0b4f7c2-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post Los Angeles County Board Supervisor, District 2 as post_d0b4f844-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post Los Angeles County Board Supervisor, District 3 as post_d0b4f8bc-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post Los Angeles County Board Supervisor, District 4 as post_d0b4f97a-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post Los Angeles County Board Supervisor, District 5 as post_d0b4fa24-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post Appointee of the Mayor of the City of Los Angeles as post_d0b4fa92-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post Appointee of Governor of California as post_d0b4fb00-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post District 7 Director, California Department of Transportation (Caltrans), Appointee of the Governor of California as post_d0b4fb6e-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post District 7 Director (Interim), California Department of Transportation (Caltrans), Appointee of the Governor of California as post_d0b4fbe6-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post Appointee of the Los Angeles County City Selection Committee, North County/San Fernando Valley sector as post_d0b4fc54-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post Appointee of Los Angeles County City Selection Committee, Southwest Corridor sector as post_d0b4fcc2-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post Appointee of Los Angeles County City Selection Committee, San Gabriel Valley sector as post_d0b4fd26-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post Appointee of Los Angeles County City Selection Committee, South East Long Beach sector as post_d0b4fd8a-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post Chair as post_d0b4fdee-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post Vice Chair as post_d0b4fe5c-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post 1st Vice Chair as post_d0b4fec0-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post 2nd Vice Chair as post_d0b4ff24-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save post Chief Executive Officer as post_d0b4ffa6-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save organization Crenshaw Project Corporation as organization_d0b6d38a-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save organization LA SAFE as organization_d0b6f040-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO pupa: save organization Special Board Budget Workshop as organization_d0b7121e-bf45-11f0-bc60-0242ac130003.json
11/11/2025 15:31:42 INFO scrapelib: GET - 'https://webapi.legistar.com/v1/metro/matters/10740'
11/11/2025 15:31:42 INFO scrapelib: HEAD - 'https://metro.legistar.com/gateway.aspx?m=l&id=10740'
11/11/2025 15:31:43 INFO scrapelib: GET - 'https://webapi.legistar.com/v1/metro/matters/10740/histories'
11/11/2025 15:31:44 INFO scrapelib: GET - 'https://webapi.legistar.com/v1/metro/matters/10740/sponsors'
11/11/2025 15:31:45 INFO scrapelib: GET - 'https://webapi.legistar.com/v1/metro/matters/10740/indexes'
11/11/2025 15:31:46 INFO scrapelib: GET - 'https://webapi.legistar.com/v1/metro/matters/10740/relations'
11/11/2025 15:31:47 INFO scrapelib: GET - 'https://webapi.legistar.com/v1/metro/matters/10740/attachments'
11/11/2025 15:31:48 INFO scrapelib: GET - 'https://webapi.legistar.com/v1/metro/matters/10740/versions'
11/11/2025 15:31:49 INFO scrapelib: GET - 'https://webapi.legistar.com/v1/metro/matters/10740/texts/12673'
11/11/2025 15:31:50 INFO pupa: save bill 2024-0885 in 2017 as bill_d17824c2-bf45-11f0-bc60-0242ac130003.json
import jurisdictions...
import organizations...
import people...
import posts...
import memberships...
import bills...
import events...
import vote events...
lametro (scrape, import)
  bills: {'matter_ids': '10740'}
bills scrape:
  duration:  0:00:08.136311
  objects:
    bill: 1
jurisdiction scrape:
  duration:  0:00:00.016529
  objects:
    jurisdiction: 1
    organization: 4
    post: 19
import:
  bill: 0 new 1 updated 0 noop
  jurisdiction: 0 new 0 updated 1 noop
  organization: 0 new 0 updated 4 noop
  post: 0 new 0 updated 19 noop

Testing Instructions

  • Pick another failing bill from Sentry and confirm you can rescrape it against the live site without error.
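
One way to script this check, sketched under assumptions: `rescrape_command` rebuilds the Demo's `docker compose` invocation for a given matter ID, and `bill_import_counts` reads the `bill:` line of the import summary. Both helper names are hypothetical, and the summary format is taken from the Demo output above rather than from any documented pupa interface.

```python
import re
import shlex


def rescrape_command(matter_id, database_url="<PROD_DB_URL>"):
    """Build the Demo's docker compose invocation for one Legistar matter ID."""
    return [
        "docker", "compose", "run", "--rm",
        "-e", f"DATABASE_URL={database_url}",
        "scrapers", "pupa", "update", "lametro", "bills",
        f"matter_ids={matter_id}",
    ]


# Matches the import summary line, e.g. "  bill: 0 new 1 updated 0 noop".
_BILL_LINE = re.compile(
    r"^\s*bill:\s+(\d+) new (\d+) updated (\d+) noop\s*$", re.MULTILINE
)


def bill_import_counts(report):
    """Return the new/updated/noop counts for bills, or None if the line is absent."""
    match = _BILL_LINE.search(report)
    if match is None:
        return None
    new, updated, noop = (int(number) for number in match.groups())
    return {"new": new, "updated": updated, "noop": noop}


# Summary excerpt copied from the Demo output.
sample = """import:
  bill: 0 new 1 updated 0 noop
  jurisdiction: 0 new 0 updated 1 noop
"""

print(shlex.join(rescrape_command(10740)))
print(bill_import_counts(sample))
```

The argument list can be passed to `subprocess.run` with `capture_output=True`; a run that exits cleanly and reports the bill under `new` or `updated` suggests the rescrape succeeded.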

@antidipyramid (Collaborator) left a comment

Amazing!

@hancush merged commit f22865b into main on Nov 17, 2025
1 check passed

Development

Successfully merging this pull request may close these issues.

DataImportError when scraping restricted bills

2 participants