Le Nguyen The Dat bio photo

Le Nguyen The Dat

~ Data Science & Engineering

Twitter Facebook LinkedIn Github
After a few hours without no luck trying to find the best way to dump Oracle DB's table into CSV with sqlplus, I've came up with my own solution. It might not be the best one, but at least it works for me:

SQL file query.sql:
  • SET COLSEP '|'
    SET ECHO OFF
    SET FEEDBACK OFF
    SET TERMOUT OFF
    SET PAGESIZE 32766 -- Maximum number we can set.
    -- i.e there will be duplication of headers.
    SET LINESIZE 32766 -- Maximum number we can set.
    SET NEWPAGE NONE
    SET VERIFY OFF
    SET TERM OFF
    SET TRIMS ON
    SET TRIMSPOOL ON
    SET UNDERLINE OFF

    SPOOL path/to/output_file.csv

    SELECT * FROM user.table;

    SPOOL OFF

    EXIT

Executing above SQL file:
  • $ export NLS_LANG=.AL32UTF8 # For character encoding
    $ export LD_LIBRARY_PATH=/usr/lib/oracle/12.1/client64/lib/ # Oracle Lib Home
    $ sqlplus64 -s username/password@"(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=oracle.server.ip.address)(PORT=port_number))(CONNECT_DATA=(SID=connection_string)))" @query.sql

Remove trailing spaces / tabs:
  • $ cat path/to/output_file.csv | awk '$1=$1' | sed -e 's/|\s\+/|/g' | sed -e 's/\s\+|/|/g' | awk '!a[$0]++' > path/to/final_output_file.csv
    # Explain:
    # awk '$1=$1': delete leading spaces
    # sed -e 's/|\s\+/|/g' | sed -e 's/\s\+|/|/g': remove spaces before and after delimiter "|"
    # awk '!a[$0]++': remove duplications

Done!

PS: As for comparison, this task would be so much easier with PostgreSQL...
  • psql -t -A -F'|' -h $PG_ENDPOINT -U $PG_USER -d $PG_DB -p $PG_PORT -c "SELECT * FROM schema.table;"