Retrieving Oracle table results from Python

A simple two liner to retrieve data results from Oracle.

# 1. Download and unzip 
#      Instant Client Package - Basic (64 bit or 32 bit same as your Python client) 
#      from http://www.oracle.com/technetwork/topics/winx64soft-089540.html
# 2. Add ..\instantclient_12_2 to environment variable "Path"
# 3. Install module cx_Oracle: 
#      py!pip install C:/cx_Oracle-6.0.2-cp36-cp36m-win_amd64.whl

import cx_Oracle as ora
import pandas as pd
conn = ora.connect("uname", "pwd", "host:port/sid")
dset = pd.read_sql("SELECT * FROM TABLE", conn)

Combining multiple files to one Excel file

This python code below allows you to choose multiple files (.csv, .xls, .xlsx) and create a multi-worksheet Excel file.


# -*- coding: utf-8 -*-

from tkinter import filedialog
from tkinter import *
from pathlib import Path
import pandas as pd

root = Tk()
infiles = filedialog.askopenfilenames(
                title = "Select the file you wish to convert", 
                filetypes = (("Excel files", "*.xls;*.xlsx"),("all files","*.*")))
counter = 1
if (len(infiles) > 0):
    p = Path(infiles[0])
    outfile = str(p.parent) + "\\newfile.xlsx"
    print(outfile)
    writer = pd.ExcelWriter(outfile, engine='xlsxwriter')

for infile in infiles:
    p = Path(infile)
    if (p.suffix == ".csv"):
        inp=pd.read_csv(infile, sep = ",")
    elif (p.suffix in [".xlsx", ".xls"]):
        inp=pd.read_excel(infile, dype = str)

    counter = counter + 1
    sheetname = p.name
    sheetname = sheetname.replace(p.suffix, "")
    inp.to_excel(writer, index = False, sheet_name = sheetname[0:30])

try:
    writer.save()
except NameError:
    print("No file selected")

Comparing Excel files with Python

A quick Python program to compare two excel files and create an output file.


from tkinter import filedialog
from tkinter import *
import pandas as pd

root = Tk()
df1=pd.read_excel(filedialog.askopenfilename(title = "Select the file you wish to compare", filetypes = (("Excel files", "*.xls;*.xlsx"), ("Text files", "*.txt;*.csv;*.tsv"),("all files","*.*"))))
df2=pd.read_excel(filedialog.askopenfilename(title = "Select the file with which you wish to compare", filetypes = (("Excel files", "*.xls;*.xlsx"), ("Text files", "*.txt;*.csv;*.tsv"),("all files","*.*"))))
df3=pd.merge(df1, df2)
writer = pd.ExcelWriter(filedialog.asksaveasfilename(title = "Select the folder where you wish to save the file", filetypes = (("Excel files", "*.xls;*.xlsx"), ("Text files", "*.txt;*.csv;*.tsv"),("all files","*.*"))), engine='xlsxwriter')
df3.to_excel(writer)
writer.save()

Suspicious Activity Report Statistics

A concise view of the SAR stats.


Tagged , , , ,

Converting CSV to Table rows in SQL Server and Oracle

Another useful block of code that will help in converting CSV to a table row.

with temp as
(
    select 108 Id, 'australia, australia, united states of america, iran' str
    UNION ALL
    select 109, 'india, iran, iraq'
    UNION ALL
    SELECT 109, 'israel, australia'
), xmlData as
(
     SELECT Id,  
         CAST ('<M>' + REPLACE(str, ',', '</M><M>') + '</M>' AS XML) AS Data  
     FROM  temp
)
SELECT xmlData.Id,  
     ltrim(rtrim(Split.a.value('.', 'VARCHAR(100)'))) AS Data  
FROM  xmlData CROSS APPLY Data.nodes ('/M') AS Split(a)

and in Oracle,

with temp as
(
    select 108 Id, 'australia, australia, united states of america, iran' str  from dual
    UNION ALL
    select 109, 'india, iran, iraq' from dual
    UNION ALL
    SELECT 109, 'israel, australia' FROM dual
)
SELECT DISTINCT ID, ltrim(regexp_substr(str, '[^' || ',' || ']+', 1, LEVEL)) str
FROM temp T
CONNECT BY regexp_substr(str, '[^' || ',' || ']+', 1, LEVEL) IS NOT NULL
ORDER BY ID;

Preparing NULL counts for SQL and Oracle data tables

I am posting this one after a very long time. Just thought this information will be useful for someone.

Here is the T-SQL for SQL Server that will fetch the columns in a table and provide you the NULL, ZERO, NEGATIVE and UNIQUE count of each table columns.

USE <database>
GO

BEGIN
	SET NOCOUNT ON
	DECLARE @TAB NVARCHAR(100) = <table_name>
	DECLARE @SQL NVARCHAR(500)
	DECLARE @RECORDS INT, @TOTRECORDS INT
	DECLARE @OBJECT_ID INT,
		@TOTAL INT
	DECLARE @COL NVARCHAR(100),
		@COLTYPE NVARCHAR(100),
		@TS NVARCHAR(100)
	SELECT @OBJECT_ID = OBJECT_ID FROM SYS.TABLES WHERE NAME = @TAB

	DECLARE col_cursor CURSOR FOR 
	SELECT	COLUMNS.NAME COLNAME, SYSTYPES.NAME COLTYPE
	FROM SYS.COLUMNS 
		INNER JOIN SYS.systypes ON SYSTEM_TYPE_ID = XUSERTYPE
		WHERE object_id = @OBJECT_ID;

	OPEN col_cursor
	FETCH NEXT FROM col_cursor INTO @COL, @COLTYPE

	set @sql = N'SELECT @ptotrecords = count(1) FROM ' + @TAB 
	EXECUTE sp_executeSQL
		@sql,
        N'@ptotrecords INT OUTPUT', @ptotrecords=@totrecords OUTPUT

    WHILE @@FETCH_STATUS = 0
    BEGIN
		set @sql = N'SELECT @precords = count(1) FROM ' + @TAB + ' WHERE ' + @COL + ' IS NULL'
		EXECUTE sp_executeSQL
			@sql,
            N'@precords INT OUTPUT', @precords=@records OUTPUT
		PRINT(convert(varchar, CURRENT_TIMESTAMP) + CHAR(9)+ @TAB + '.' + @COL + '[' + @COLTYPE + ']' + CHAR(9) + 'NULL' + CHAR(9) + convert(varchar, @records) + CHAR(9) + convert(varchar, @totrecords)+ CHAR(9) + convert(varchar, @records*1.00/@totrecords*100))

		set @sql = N'SELECT @precords = count(distinct ' + @COL + ') FROM ' + @TAB  + ' WHERE ' + @COL + ' IS NOT NULL'
		EXECUTE sp_executeSQL
			@sql,
            N'@precords INT OUTPUT', @precords=@records OUTPUT
		PRINT(convert(varchar, CURRENT_TIMESTAMP) + CHAR(9)+ @TAB + '.' + @COL + '[' + @COLTYPE + ']' + CHAR(9) + 'UNIQUE' + CHAR(9) + convert(varchar, @records) + CHAR(9) + convert(varchar, @totrecords)+ CHAR(9) + convert(varchar, @records*1.00/@totrecords*100))

		IF @COLTYPE IN ('MONEY', 'NUMERIC', 'INT') 
		BEGIN
			set @sql = N'SELECT @precords = count(1) FROM ' + @TAB + ' WHERE ' + @COL + ' = 0'
			EXECUTE sp_executeSQL
				@sql,
				N'@precords INT OUTPUT', @precords=@records OUTPUT
			PRINT(convert(varchar, CURRENT_TIMESTAMP) + CHAR(9)+ @TAB + '.' + @COL + '[' + @COLTYPE + ']' + CHAR(9) + 'ZERO' + CHAR(9) + convert(varchar, @records) + CHAR(9) + convert(varchar, @totrecords)+ CHAR(9) + convert(varchar, @records*1.00/@totrecords*100))
			set @sql = N'SELECT @precords = count(1) FROM ' + @TAB + ' WHERE ' + @COL + ' < 0'
			EXECUTE sp_executeSQL
				@sql,
				N'@precords INT OUTPUT', @precords=@records OUTPUT
			PRINT(convert(varchar, CURRENT_TIMESTAMP) + CHAR(9)+ @TAB + '.' + @COL + '[' + @COLTYPE + ']' + CHAR(9) + 'NEGATIVE' + CHAR(9) + convert(varchar, @records) + CHAR(9) + convert(varchar, @totrecords)+ CHAR(9) + convert(varchar, @records*1.00/@totrecords*100))
		END
		FETCH NEXT FROM col_cursor INTO @COL, @COLTYPE
    END
	CLOSE col_cursor;
	DEALLOCATE col_cursor;
END

PL/SQL code below does a similar job in Oracle.

SET SERVEROUTPUT ON SIZE 1000000
DECLARE
   L_ROW_COUNT                        PLS_INTEGER;
   L_TOTAL_COUNT                        PLS_INTEGER;
   COL  VARCHAR(200);
BEGIN
  FOR Y IN (SELECT * FROM ALL_TABLES WHERE TABLESPACE_NAME = '<table space name>' AND TABLE_NAME LIKE '<table name>%'  ORDER BY OWNER, TABLE_NAME ) LOOP
    EXECUTE IMMEDIATE    'SELECT COUNT(*)FROM ' || Y.OWNER || '.' || Y.TABLE_NAME INTO L_TOTAL_COUNT;
     FOR x IN (SELECT * FROM all_tab_columns WHERE table_name = Y.TABLE_NAME and owner = Y.OWNER ORDER BY  column_id ) LOOP
        COL := X.TABLE_NAME || '.' || X.COLUMN_NAME || '[' || X.DATA_TYPE || ']' || CHR(9) ;
  
        EXECUTE IMMEDIATE    'SELECT COUNT(*) FROM ' || X.OWNER || '.' || X.TABLE_NAME || ' WHERE ' || X.TABLE_NAME || '.' || X.COLUMN_NAME || ' IS NULL' INTO L_ROW_COUNT;
        DBMS_OUTPUT.PUT_LINE (COL || 'NULL' || CHR(9) || TO_CHAR (l_row_count, 'fm999,999,999,990') || CHR(9) || TO_CHAR (L_TOTAL_COUNT, 'fm999,999,999,990') || CHR(9) || TO_CHAR (L_ROW_COUNT*1.0/L_TOTAL_COUNT*100.0, 'fm999,999,999,990'));
  
        EXECUTE IMMEDIATE    'SELECT COUNT(DISTINCT ' || X.TABLE_NAME || '.' || X.COLUMN_NAME || ') FROM ' || X.OWNER || '.' || X.TABLE_NAME INTO L_ROW_COUNT;
        DBMS_OUTPUT.PUT_LINE (COL || 'UNIQUE' || CHR(9) || TO_CHAR (L_ROW_COUNT, 'fm999,999,999,990')|| CHR(9) || TO_CHAR (L_TOTAL_COUNT, 'fm999,999,999,990') || CHR(9) || TO_CHAR (L_ROW_COUNT*1.0/L_TOTAL_COUNT*100.0, 'fm999,999,999,990'));
        IF X.DATA_TYPE IN ( 'NUMBER') THEN
          EXECUTE IMMEDIATE    'SELECT COUNT(*) FROM ' || X.OWNER || '.' || X.TABLE_NAME || ' WHERE ' || X.TABLE_NAME || '.' || X.COLUMN_NAME || ' = 0' INTO L_ROW_COUNT;
          DBMS_OUTPUT.PUT_LINE (COL || 'ZERO' || CHR(9) || TO_CHAR (L_ROW_COUNT, 'fm999,999,999,990')|| CHR(9) || TO_CHAR (L_TOTAL_COUNT, 'fm999,999,999,990') || CHR(9) || TO_CHAR (L_ROW_COUNT*1.0/L_TOTAL_COUNT*100.0, 'fm999,999,999,990'));
          
          EXECUTE IMMEDIATE    'SELECT COUNT(*) FROM ' || X.OWNER || '.' || X.TABLE_NAME || ' WHERE ' || X.TABLE_NAME || '.' || X.COLUMN_NAME || ' < 0' INTO L_ROW_COUNT;
          DBMS_OUTPUT.PUT_LINE (COL || 'NEGATIVE' || CHR(9) || TO_CHAR (L_ROW_COUNT, 'fm999,999,999,990')|| CHR(9) || TO_CHAR (L_TOTAL_COUNT, 'fm999,999,999,990') || CHR(9) || TO_CHAR (L_ROW_COUNT*1.0/L_TOTAL_COUNT*100.0, 'fm999,999,999,990'));
        END IF;
     END LOOP;
  END LOOP;
END;

Extracting XML Element and Attribute values in Oracle

Let’s assume that you have a XML (xmltype) table column.

<?xml version="1.0" encoding="UTF-8"?>
  <form>
     <data k="a1">
        <value>row 1</value>
     </data>
     <data k="a2">
       <value>row 2</value>
     </data>
</form>

One way to easily extract data is given below

Select Xmltype(t.Col).Extract('//form/data[@k="a1"]/value/text()').Getstringval()  a1
From Mytable t

and reading an attribute value can be like this:

Select Xmltype(t.Col).Extract('//form/data[1]/@k').Getstringval()  k
From Mytable t

A Quick and Dirty Algorithm for fuzzy matching

The VBA program below is an easy-to-use Excel function that can be used to quickly check two strings through a fuzzy logic.  I created this program to locate some phonetic equivalents in Tamil.

Input 1 Input 2 Fuzzy 1 Fuzzy 2 Result
Madrass Madras mrs mrs Match
Rangoon Yangooon rnkn nkn Not a Match
Thiruvizhaa Thiruvila rvl rvl Match
Aan Azhagan Aanazhagan anlkn anlkn Match

Feel free to apply your language equivalents that you think are relevant.

Option Explicit

'FuzzyLogic
Function Cleanup(Text As String) As String
    Dim firstChar As String
    firstChar = Mid(Text, 1, 1)
    Text = Mid(Text, 2) 'keep the first char
    Text = LCase(Text) 'change the case
    Text = RemoveSplChars(Text) 'remove spl characters
    Text = RemoveVowels(Text) 'remove vowels
    Text = ReplaceLangEquivalents(Text) 'replace lang equivalents
    Text = RemoveRepeatingChars(Text) ' remove repeating  (or duplicate) chars
    Text = LCase(firstChar & Text) 'include the first char

    Text = ReplaceLangEquivalents(Text) 'replace lang equivalents once more

    Cleanup = Text
End Function

Private Function RemoveSplChars(Text As String) As String
    Text = Replace(Text, ".", "")
    Text = Replace(Text, " ", "")
    RemoveSplChars = Text
End Function

Private Function RemoveVowels(Text As String) As String
    Text = Replace(Text, "a", "", , , vbTextCompare)
    Text = Replace(Text, "e", "", , , vbTextCompare)
    Text = Replace(Text, "i", "", , , vbTextCompare)
    Text = Replace(Text, "o", "", , , vbTextCompare)
    Text = Replace(Text, "u", "", , , vbTextCompare)
    RemoveVowels = Text
End Function

Private Function RemoveRepeatingChars(Text As String) As String
    Dim i As Integer
x:
    'remove 2 chars
    For i = 1 To Len(Text) - 2
        If Mid(Text, i, 2) = Mid(Text, i + 2, 2) Then
            Text = Replace(Text, Mid(Text, i, 2) & Mid(Text, i + 2, 2), Mid(Text, i, 2), , , vbTextCompare)
            GoTo x:
        End If
    Next

y:
    'remove 1 chars
    For i = 1 To Len(Text) - 1
        If Mid(Text, i, 1) = Mid(Text, i + 1, 1) Then
            Text = Replace(Text, Mid(Text, i, 1) & Mid(Text, i + 1, 1), Mid(Text, i, 1), , , vbTextCompare)
            GoTo y:
        End If
    Next
    RemoveRepeatingChars = Text
End Function

Private Function ReplaceLangEquivalents(Text As String, Optional Lang As String = "ta") As String
    Select Case (Lang)
        Case "ta":

            Text = Replace(Text, "gh", "g", , , vbTextCompare)
            Text = Replace(Text, "th", "d", , , vbTextCompare)
            Text = Replace(Text, "dh", "d", , , vbTextCompare)
            Text = Replace(Text, "zh", "l", , , vbTextCompare)
            Text = Replace(Text, "sh", "s", , , vbTextCompare)
            Text = Replace(Text, "dr", "r", , , vbTextCompare)

            Text = Replace(Text, "g", "k", , , vbTextCompare) 'sakodhara, sagodhara
            Text = Replace(Text, "y", "", , , vbTextCompare) 'koyil, kovil, koil
    End Select
    ReplaceLangEquivalents = Text
End Function

Vinoth’s Wedding Video

Thought I will spend some time editing my brother’s long wedding video into a short seven minutes. Here is the final product.

Comparing Excel Worksheets using VBA

My wife asked me if I could help her with a macro that will compare two Excel Worksheets to generate a variance or a diff report. Here it is;

Sub CompareSheets(Sheet1 As Worksheet, Sheet2 As Worksheet, RangeToCompare As String, SkipFirstRow As Boolean)
    Dim i As Integer, j As Integer
    Dim val1, val2
    Dim ResultSheet As Worksheet
    Set ResultSheet = Application.Sheets.Add()
    ResultSheet.Name = "ResultSheet" & Format(DateTime.Date, "yyyymmdd") & "-" & Format(DateTime.Time, "hhmmss")
    For i = 1 To Sheet1.UsedRange.Rows.Count
        For j = 1 To Sheet1.UsedRange.Columns.Count
            If j = 1 Or (SkipFirstRow = True And i = 1) Then
                ResultSheet.Cells(i, j) = Sheet1.Cells(i, j)
            Else
                val1 = Sheet1.Cells(i, j)
                val2 = Application.WorksheetFunction.VLookup(Sheet1.Cells(i, 1), Sheet2.Range(RangeToCompare), j, False)
                ResultSheet.Cells(i, j) = (val1 = val2)
            End If

        Next j
    Next i
End Sub

This Sub-routine can be simply called like this:

Sub Test()
    CompareSheets Application.Sheets(2), Application.Sheets(3), "A:E", True
End Sub