Add a feature to handle setid for UCPath tables

Ideally I would like to avoid making a new macro specifically for setid'd UCPath tables.

#### (A general feature approach)
A feature could be added such that a where condition or conditions could be declared in the source.yml for each table and then surfaced in the generated model.
```yml
      - name: ps_jobcode_tbl
        description: jobcode validation table
        meta:
          effective_dated: true
          partition_columns: ["jobcode", "setid"]
          final_where_condition: ["setid in ('UCSHR')"]
```
The declared `final_where_condition` would surface in the final select statement of the staging model. For example,
```sql
-- Dummy Line

/*  The partition columns are: ['jobcode', 'setid']
*/

with
    source as (
        select * from {{ source(  'ucp', 'ps_jobcode_tbl' ) }}
        where
            _fivetran_deleted != true
            and dml_ind != 'D'

    ),

    valid_to as (
        -- Group the records so that each group belonging to one effdt gets the same valid_to date
        select
            jobcode,
            setid,
            effdt,
            coalesce((lag(to_date(effdt), 1) over (
                partition by
                    jobcode,
                    setid
                order by effdt desc
            ) - 1), '2099-12-31') as valid_to

        from source
        group by
            1,
            2,
            3
    ),

    final as (
        select
            source._fivetran_deleted,
            source._fivetran_id,
            source._fivetran_synced,
            source.avail_telework,
            source.barg_unit,
            source.can_noc_cd,
            source.comp_frequency,
            source.company,
            source.cr_bt_dtm,
            source.cr_bt_nbr,
            source.currency_cd,
            source.descr,
            source.descrshort,
            source.directly_tipped,
            source.dml_ind,
            source.eeo1code,
            source.eeo4code,
            source.eeo5code,
            source.eeo6code,
            source.eeo_job_group,
            source.eff_status,
            source.effdt,
            source.eg_academic_rank,
            source.eg_group,
            source.encumb_sal_amt,
            source.encumb_sal_optn,
            source.encumber_indc,
            source.flsa_status,
            source.function_cd,
            source.grade,
            source.ipedsscode,
            source.job_family,
            source.job_function,
            source.job_sub_func,
            source.jobcode,
            source.key_jobcode,
            source.last_update_date,
            source.lastupddttm,
            source.lastupdoprid,
            source.manager_level,
            source.med_chkup_req,
            source.ods_vrsn_nbr,
            source.posn_mgmt_indc,
            source.reg_region,
            source.reg_temp,
            source.retro_percent,
            source.retro_rate,
            source.sal_admin_plan,
            source.sal_range_currency,
            source.sal_range_freq,
            source.sal_range_max_rate,
            source.sal_range_mid_rate,
            source.sal_range_min_rate,
            source.setid,
            source.setid_salary,
            source.std_hours,
            source.std_hrs_frequency,
            source.step,
            source.survey_job_code,
            source.survey_salary,
            source.trn_program,
            source.union_cd,
            source.upd_bt_dtm,
            source.upd_bt_nbr,
            source.us_occ_cd,
            source.us_soc_cd,
            source.workers_comp_cd,

            -- New Objects 
            to_date(source.effdt) as valid_from,
            valid_to.valid_to as valid_to,
            case
                when valid_from > {{ var("current_date_pst") }} then 'future'
                when valid_to.valid_to < {{ var("current_date_pst") }} then 'past'
                when valid_to.valid_to >= {{ var("current_date_pst") }} then 'current'
            end as current_record_desc,
            case
                when current_record_desc = 'current' then true
                when current_record_desc in ('future', 'past') then false
            end as is_current_record

        from source

        left outer join valid_to
            on source.jobcode = valid_to.jobcode
                and source.setid = valid_to.setid
                and source.effdt = valid_to.effdt
    )

select * from final
where -- <-- THIS WHERE CONDITION WOULD BE VARIABLE AND ONLY POPULATE WHEN THE final_where_condition KEY IS POPULATED
    setid in ('UCHSR')

```

#### (A UCPath setid specific approach)
If we choose to make a setid specific UCPath macro, then the meta information declared can be more specific (naming wise).
```yml
      - name: ps_jobcode_tbl
        description: jobcode validation table
        meta:
          effective_dated: true
          partition_columns: ["jobcode", "setid"]
          ucsc_setid: ["UCSHR"]
```
We could also have more freedom in where the condition is set. For example we can apply the condition at the source CTE,
```sql
-- Dummy Line

/*  The partition columns are: ['jobcode', 'setid']
*/

with
    source as (
        select * from {{ source(  'ucp', 'ps_jobcode_tbl' ) }}
        where
            _fivetran_deleted != true
            and dml_ind != 'D'
            and setid in ('UCSHR') -- <-- THIS LINE WOULD NEED TO BE HARCODED IN THE MACRO. THE IN LIST VALUES WOULD BE THE ONLY VARIABLE PORTION.
    ),

    valid_to as (
        -- Group the records so that each group belonging to one effdt gets the same valid_to date
        select
            jobcode,
            setid,
            effdt,
            coalesce((lag(to_date(effdt), 1) over (
                partition by
                    jobcode,
                    setid
                order by effdt desc
            ) - 1), '2099-12-31') as valid_to

        from source
        group by
            1,
            2,
            3
    ),

    final as (
        select
            source._fivetran_deleted,
            source._fivetran_id,
            source._fivetran_synced,
            source.avail_telework,
            source.barg_unit,
            source.can_noc_cd,
            source.comp_frequency,
            source.company,
            source.cr_bt_dtm,
            source.cr_bt_nbr,
            source.currency_cd,
            source.descr,
            source.descrshort,
            source.directly_tipped,
            source.dml_ind,
            source.eeo1code,
            source.eeo4code,
            source.eeo5code,
            source.eeo6code,
            source.eeo_job_group,
            source.eff_status,
            source.effdt,
            source.eg_academic_rank,
            source.eg_group,
            source.encumb_sal_amt,
            source.encumb_sal_optn,
            source.encumber_indc,
            source.flsa_status,
            source.function_cd,
            source.grade,
            source.ipedsscode,
            source.job_family,
            source.job_function,
            source.job_sub_func,
            source.jobcode,
            source.key_jobcode,
            source.last_update_date,
            source.lastupddttm,
            source.lastupdoprid,
            source.manager_level,
            source.med_chkup_req,
            source.ods_vrsn_nbr,
            source.posn_mgmt_indc,
            source.reg_region,
            source.reg_temp,
            source.retro_percent,
            source.retro_rate,
            source.sal_admin_plan,
            source.sal_range_currency,
            source.sal_range_freq,
            source.sal_range_max_rate,
            source.sal_range_mid_rate,
            source.sal_range_min_rate,
            source.setid,
            source.setid_salary,
            source.std_hours,
            source.std_hrs_frequency,
            source.step,
            source.survey_job_code,
            source.survey_salary,
            source.trn_program,
            source.union_cd,
            source.upd_bt_dtm,
            source.upd_bt_nbr,
            source.us_occ_cd,
            source.us_soc_cd,
            source.workers_comp_cd,

            -- New Objects 
            to_date(source.effdt) as valid_from,
            valid_to.valid_to as valid_to,
            case
                when valid_from > {{ var("current_date_pst") }} then 'future'
                when valid_to.valid_to < {{ var("current_date_pst") }} then 'past'
                when valid_to.valid_to >= {{ var("current_date_pst") }} then 'current'
            end as current_record_desc,
            case
                when current_record_desc = 'current' then true
                when current_record_desc in ('future', 'past') then false
            end as is_current_record

        from source

        left outer join valid_to
            on source.jobcode = valid_to.jobcode
                and source.setid = valid_to.setid
                and source.effdt = valid_to.effdt
    )

select * from final
```
The reason `final_where_condition` is restricted to the final select statement is so devs have more freedom in what kind of conditions they can specify. Since it is going to be added to the final select statement then can apply conditions on fields that were added during the staging process.  

I prefer the `final_where_condition` method because this can be used in other models and not just for UCPath setid tables. 

For example for our effseq'd models we could `final_where_condition = ["setid in ('UCHSR
')", "is_max_effseq_of_effdt = true"]`.
```sql
.
.
.
select * from final
    where
        setid in ('UCHSR')
        and is_max_effseq_of_effdt = true
```


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add a feature to handle setid for UCPath tables #1

(A general feature approach)

(A UCPath setid specific approach)

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Add a feature to handle setid for UCPath tables #1

Description

(A general feature approach)

(A UCPath setid specific approach)

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions