Developing and Contributing to Ibis

For a primer on general open source contributions, see the pandas contribution guide. The project is run in much the same way as pandas.

Linux Test Environment Setup

Conda Environment Setup

  1. Install the latest version of miniconda:

    # Download the miniconda bash installer
    curl -Ls -o $HOME/miniconda.sh \
        https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
    
    # Run the installer
    bash $HOME/miniconda.sh -b -p $HOME/miniconda
    
    # Put the conda command on your PATH
    export PATH="$HOME/miniconda/bin:$PATH"
    
  2. Create the development environment of your choice (Python 3.6 in this example), activate it, and install ibis in development mode:

    # Create a conda environment ready for ibis development
    conda env create --name ibis36 --file=ci/requirements_dev-3.6.yml
    
    # Activate the conda environment
    source activate ibis36
    
    # Install ibis
    python setup.py develop
    
  3. Install Docker.

  4. Download the test data:

    DATA_DIR=$PWD
    ci/datamgr.py download --directory=$DATA_DIR
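
Before moving on to the databases, it can help to confirm that the tools used in the steps above are on your PATH. The following is a minimal sketch, not part of the ibis tooling; `missing_tools` is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper: print each named tool that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the commands this guide relies on; no output means all were found.
missing_tools curl conda docker
```

Any name printed by the last line needs to be installed (or added to PATH) before continuing.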
    

Setting Up Test Databases

Impala (with UDFs)

  1. Start the Impala docker image in another terminal:

    # Keep this running for as long as you want to test ibis
    docker run --tty --rm --hostname impala cpcloud86/impala:java8
    
  2. Load data and UDFs into Impala:

    test_data_admin.py load --data --data-dir=$DATA_DIR
    

Clickhouse

  1. Start the Clickhouse Server docker image in another terminal:

    # Keep this running for as long as you want to test ibis
    docker run --rm -p 9000:9000 --tty yandex/clickhouse-server
    
  2. Load data:

    ci/datamgr.py clickhouse \
        --database $IBIS_TEST_CLICKHOUSE_DB \
        --data-directory $DATA_DIR/ibis-testing-data \
        --script ci/clickhouse_load.sql \
        functional_alltypes batting diamonds awards_players
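
A freshly started container may need a few seconds before it accepts connections, so the load in step 2 can fail if it runs too early. The sketch below polls a TCP port until it opens. It assumes bash (it relies on bash's `/dev/tcp` redirection), and `wait_for_port` is a hypothetical helper name, not something shipped with ibis.

```shell
# Hypothetical helper: wait until host:port accepts TCP connections.
# Usage: wait_for_port HOST PORT [RETRIES]   (RETRIES defaults to 30)
wait_for_port() {
    host=$1
    port=$2
    retries=${3:-30}
    while [ "$retries" -gt 0 ]; do
        # Opening /dev/tcp/HOST/PORT succeeds only if the connection is accepted.
        if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
            return 0
        fi
        retries=$((retries - 1))
        sleep 1
    done
    return 1
}
```

For example, `wait_for_port localhost 9000` could be run before the load command, since 9000 is the port published by the `docker run` command in step 1.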
    

PostgreSQL

You can use either the PostgreSQL installation on the Impala Docker image or one installed directly on your machine.

Here’s how to load test data into PostgreSQL:

ci/datamgr.py postgres \
    --database $IBIS_TEST_POSTGRES_DB \
    --data-directory $DATA_DIR/ibis-testing-data \
    --script ci/postgresql_load.sql \
    functional_alltypes batting diamonds awards_players

SQLite

SQLite comes preinstalled on many systems. If you used the conda setup instructions above, SQLite is also available in the conda environment.

Here’s how to load test data into SQLite:

ci/datamgr.py sqlite \
    --database $IBIS_TEST_SQLITE_DB_PATH \
    --data-directory $DATA_DIR/ibis-testing-data \
    --script ci/sqlite_load.sql \
    functional_alltypes batting diamonds awards_players
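
The load commands above read the database names from `IBIS_TEST_CLICKHOUSE_DB`, `IBIS_TEST_POSTGRES_DB`, and `IBIS_TEST_SQLITE_DB_PATH`, which need to be set in your shell beforehand. A minimal sketch follows; the values `ibis_testing` and `ibis_testing.db` are assumptions chosen for illustration, so substitute whatever matches your local setup.

```shell
# Assumed example values; each variable keeps its current value if already set.
export IBIS_TEST_CLICKHOUSE_DB="${IBIS_TEST_CLICKHOUSE_DB:-ibis_testing}"
export IBIS_TEST_POSTGRES_DB="${IBIS_TEST_POSTGRES_DB:-ibis_testing}"

# Place the SQLite file next to the test data, falling back to the current directory.
export IBIS_TEST_SQLITE_DB_PATH="${IBIS_TEST_SQLITE_DB_PATH:-${DATA_DIR:-$PWD}/ibis_testing.db}"
```

Because the variables are exported, they are also visible to any test run started from the same shell.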

Running Tests

You are now ready to run the full ibis test suite:

pytest ibis

Contribution Ideas

Here are a few ideas to consider beyond participating in the primary development roadmap:

  • Documentation
  • Use cases and IPython notebooks
  • Other SQL-based backends (Presto, Hive, Spark SQL)
  • S3 filesystem support
  • Integration with MLlib via PySpark