Script to knock off positive and negative values from a data set

11 posts / 0 new
Last post
aliasgar.sk
Offline
Joined: 08/04/2019 - 05:40
Script to knock off positive and negative values from a data set

Hi Brian,
Hope you're doing good. Am looking for a Idea script which can be used to knock off positive and negative values from a data set taking into considerations the GL codes as well.
Currently, I am able to perform this in an excel file by using the below mentioned formula,
“=(COUNTIFS($B$2:B2,B2,$A$2:A2,A2)<=MIN(COUNTIFS($B$2:$B$1000000,-B2,$A$2:$A$1000000,A2),COUNTIFS($B$2:$B$1000000,B2,$A$2:$A$1000000,A2)))”
Just Replace the “A” in the formula with the column for “GL code” and “B” in the formula with the column for  “Net amount”
But, the above mentioned formula works only for data which is small (3000-5000 line items in excel)
I am looking to replicate the above mentioned formula in Idea, so we can replicate the same for huge amount of data as well. This would be quite useful to eliminate provision entries passed by the management.
Looking forward to your comments on this.
PS: Have attached a illustration in excel for your reference

Brian Element's picture
Brian Element
Offline
Joined: 07/11/2012 - 19:57

Hi aliasgar.sk,

That sounds interesting.  So if the GL only has one net amount (but could have multplie transactions with the same amount) it would be a False.  If the GL has two or more different net amounts then the minimum and maximum would be removed.  Is that correct?

Brian

aliasgar.sk
Offline
Joined: 08/04/2019 - 05:40

Hi Brian,
Thank you for your prompt reply.
So, the basic objective is to knock-off postive and negative amounts against the same GL code.
Attaching another excel file for reference

 
 
 

True = items which have got knocked off (sum total is Nil) against same GL code
Fales = items which have not got knocked off against same GL code
Hope this clears.
Thanks again,
Ali
 

Files: 
Brian Element's picture
Brian Element
Offline
Joined: 07/11/2012 - 19:57

Thanks for the clarrification.  The only way I can think of doing this is through a script.

aliasgar.sk
Offline
Joined: 08/04/2019 - 05:40

Yes Brian, a script would be really helpful!
Will it be possible for you to develop a script for this task?

aliasgar.sk
Offline
Joined: 08/04/2019 - 05:40

This might probably be helpful, a colleague of mine broke down the excel formula and recreated the process in Idea but the only limitation being amounts were not knocked off against the same GL.
The steps carried out our mentioned below,
Step 1: Run the Count Script on the Net Amount Field. Rename the new field created as "Cumulative_count" 
Step 2: Append a Field with the opposite sign of Net Amount field. Call this field as "Amount Negative"
Step 3: Summarize the database by Net Amount field. Don’t total any fields. Don't use quick summarization!
Step 4: Join the Main database and the summarized database. Main database is primary (join all the fields) and summarized database is secondry (join only "No_of_rec" field). Match by "Net Amount". Check the Box for "All record in Primary field". Call this database as "Joined database 1". Rename the "No_of_Rec" field as "Count_P"
Step 5: Now join "Joined database 1" and summarized database. Joined database 1 is primary (Join all the fields) and summarized database is secondary (join only "No_of_Rec" field). Match by "Amount Negative". Check the Box for "All record in Primary field". Call this database as "joined database 2".  Rename the "No_of_Rec" field as "Count_N"
Step 6: Append a field to "Joined database 2" with the formula @Min(Count_P, Count_N). Name this field as Min
Step 7: Append a field to "Joined database 2" with the fomula @if(Cumulative_count<=Min,1,0). Name this field as Sanitize
Step 8: Extract all rows with "Sanitize" field as "1". These are the rows that can be deleted. The control total should be 0
Step 9: Extract all rows with "Sanitize" field as "0". These are the rows that can be retained. The control total should be same as that of the main database
As you can see this is quite a long process! :(

Brian Element's picture
Brian Element
Offline
Joined: 07/11/2012 - 19:57

I was trying to figure this out in IDEAScript today and it is a bit too complicated with IDEAScript because of the limited arrays.  So I could do it for a small file but it would fail for a large file.  Which version of IDEA do you have as Python script should be able to handle this as it has better array handling.

aliasgar.sk
Offline
Joined: 08/04/2019 - 05:40

Hi Brian, thank you for taking out the time for this! 
I have the 10.3 version of Idea 

klmi
Offline
Joined: 02/13/2019 - 08:41

Although I like the easy way to work with arrays in Python Brian suggested, as far as I understood your case, you only have 1 to 1 relations which should be matched by GL_CODE and NET_AMOUNT. That sounds as a relative easy job for IDEA. But with 1 to n or m to n relations my solution would not be possible! One client was using the software ADRA MATCH which can do such reconcilation jobs.

However my output differs from yours finally you will get the same result. Following I have copied my history and added some comments:

' Import Excel
Set task = Client.GetImportTask("ImportExcel")
dbName = "[...]\Illustration.xlsx"
task.FileToImport = dbName
task.SheetToImport = "03"
task.OutputFilePrefix = "Illustration"
task.FirstRowIsFieldName = "TRUE"
task.EmptyNumericFieldAsZero = "FALSE"
task.PerformTask
dbName = task.OutputFilePath("03")
Set task = Nothing
Client.OpenDatabase(dbName)

' Extract debit bookings
Set db = Client.OpenDatabase("Illustration-03.IMD")
Set task = db.Extraction
task.IncludeAllFields
dbName = "debit.IMD"
task.AddExtraction dbName, "", " NET_AMOUNT >0"
task.CreateVirtualDatabase = False
task.PerformTask 1, db.Count
Set task = Nothing
Set db = Nothing
Client.OpenDatabase (dbName)

' Index for matching / joining
Set db = Client.OpenDatabase("debit.IMD")
Set task = db.Index
task.AddKey "NET_AMOUNT", "A"
task.AddKey "GL_CODE", "A"
task.Index FALSE
Set task = Nothing
Set db = Nothing

' Extract credit bookings
Set db = Client.OpenDatabase("Illustration-03.IMD")
Set task = db.Extraction
task.IncludeAllFields
dbName = "credit.IMD"
task.AddExtraction dbName, "", " NET_AMOUNT <0"
task.CreateVirtualDatabase = False
task.PerformTask 1, db.Count
Set task = Nothing
Set db = Nothing
Client.OpenDatabase (dbName)

' Append new field with opposite amounts to credit bookings
Set db = Client.OpenDatabase("credit.IMD")
Set task = db.TableManagement
Set field = db.TableDef.NewField
field.Name = "MATCH_AMOUNT"
field.Description = ""
field.Type = WI_VIRT_NUM
field.Equation = "NET_AMOUNT*-1"
field.Decimals = 0
task.AppendField field
task.PerformTask
Set task = Nothing
Set db = Nothing
Set field = Nothing

' Index
Set db = Client.OpenDatabase("credit.IMD")
Set task = db.Index
task.AddKey "MATCH_AMOUNT", "A"
task.Index FALSE
Set task = Nothing
Set db = Nothing

' Match credit and debit amounts
Set db = Client.OpenDatabase("credit.IMD")
Set task = db.JoinDatabase
task.FileToJoin "debit.IMD"
task.IncludeAllPFields
task.IncludeAllSFields
task.AddMatchKey "MATCH_AMOUNT", "NET_AMOUNT", "A"
task.AddMatchKey "GL_CODE", "GL_CODE", "A"
task.CreateVirtualDatabase = False
dbName = "match_debit_credit1.IMD"
task.PerformTask dbName, "", WI_JOIN_ALL_REC
Set task = Nothing
Set db = Nothing
Client.OpenDatabase (dbName)

Last thing you have to do is comparing MATCH_AMOUNT and NET_AMOUNT. IF they are equal the bookings are matching (1 or true in your Excel file).

Brian Element's picture
Brian Element
Offline
Joined: 07/11/2012 - 19:57

If it was a 1 to 1 match it would be simple but my understanding is it can be a many to many match but you can only match one at a time.  The example only has 1 to 1 but I think you could also have the following scenarios:

GL      Net Amount     Match

10001                 1     True

10001                 1     True

10001                 2     False

10001               -1     True

10001               -1     True

klmi
Offline
Joined: 02/13/2019 - 08:41

The problem that I see with the given information (f.e. no voucher number) is that we can have more than one (valid?) solution if 1 to n / m to n relations are allowed.