GitHub - ddsh/bigquery-rank: Sorting framework for large BigQuery tables

BigQuery Rank

Principle

Computes a new BigQuery table that adds each row's rank with respect to
a specified field.
1. Extract the BigQuery src_table to Cloud Storage in the specified bucket
2. Download the exported files to /tmp
3. Uncompress the files, sort them, and compute the rank
4. Upload the ranked file to BigQuery as dst_table
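
Step 3 can be sketched as follows. This is a minimal illustration, not the code in bigquery_rank.py: it assumes the exported shards are gzip-compressed CSV files with a header row and that the data fits in memory. The function name `rank_gzip_csv` and its parameters are hypothetical.

```python
import csv
import gzip


def rank_gzip_csv(src_paths, dst_path, field, reverse=False, numerical=False):
    """Read gzip-compressed CSV shards, sort by `field`, write a ranked CSV.

    Hypothetical helper: assumes every shard has its own header row and
    that all rows fit in memory.
    """
    rows = []
    fieldnames = None
    for path in src_paths:
        with gzip.open(path, "rt", newline="") as handle:
            reader = csv.DictReader(handle)
            fieldnames = fieldnames or reader.fieldnames
            rows.extend(reader)

    # Compare values as numbers or as plain strings, depending on the flag.
    if numerical:
        key = lambda row: float(row[field])
    else:
        key = lambda row: row[field]
    rows.sort(key=key, reverse=reverse)

    with open(dst_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(fieldnames or []) + ["rank"])
        writer.writeheader()
        # The rank is the 1-based position in the sorted order.
        for position, row in enumerate(rows, start=1):
            writer.writerow({**row, "rank": position})
    return len(rows)
```

Sorting in memory keeps the sketch short. For tables too large for memory, an external sort on the downloaded files would be the natural replacement.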

Arguments

project : Google Cloud Platform Project Id
bucket : Storage Bucket Id
dataset : BigQuery Dataset Id
src_table : BigQuery Source Table
dst_table : BigQuery Destination Table
field : Field of the source table to sort by
reverse : Sort in reverse order (optional flag --reverse)
numerical : Sort numerically (optional flag --numerical)

Usage

  bigquery_rank.py <project> <bucket> <dataset> <src_table> <dst_table> <field> [--reverse] [--numerical]
  bigquery_rank.py -h | --help
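
The usage line above could be parsed along these lines. The README does not show how bigquery_rank.py reads its arguments, so the use of `argparse` and the helper name `build_parser` are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser mirroring the usage line (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="bigquery_rank.py")
    # Six required positional arguments, in the order given in the usage line.
    for name in ("project", "bucket", "dataset", "src_table", "dst_table", "field"):
        parser.add_argument(name)
    # Two optional boolean flags; both default to False.
    parser.add_argument("--reverse", action="store_true", help="Reverse sorting")
    parser.add_argument("--numerical", action="store_true", help="Numerical sorting")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

`argparse` adds the `-h` / `--help` option automatically, which matches the second usage line.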

Example

Input table:

id	historical_num_purchases
'Alice'	4
'Bob'	3
'Charlie'	5

Output table:

id	historical_num_purchases	rank
'Alice'	4	2
'Bob'	3	3
'Charlie'	5	1
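
The example can be reproduced in memory with a short sketch. The output above gives rank 1 to the largest value, which is consistent with a descending numerical sort. Mapping that to `reverse=True, numerical=True` is an assumption, and `add_rank` is a hypothetical helper rather than a function from the script.

```python
def add_rank(rows, field, reverse=False, numerical=False):
    """Return copies of `rows` with a 1-based `rank` column, keeping row order."""
    if numerical:
        key = lambda row: float(row[field])
    else:
        key = lambda row: str(row[field])
    # Sort row indices rather than rows so the input order is preserved.
    order = sorted(range(len(rows)), key=lambda i: key(rows[i]), reverse=reverse)
    ranks = {index: position for position, index in enumerate(order, start=1)}
    return [{**row, "rank": ranks[i]} for i, row in enumerate(rows)]


rows = [
    {"id": "Alice", "historical_num_purchases": 4},
    {"id": "Bob", "historical_num_purchases": 3},
    {"id": "Charlie", "historical_num_purchases": 5},
]
ranked = add_rank(rows, "historical_num_purchases", reverse=True, numerical=True)
for row in ranked:
    print(row["id"], row["historical_num_purchases"], row["rank"])
# Alice 4 2
# Bob 3 3
# Charlie 5 1
```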
