Implement a code block that provides the features of the Python pandas DataFrame or the R data frame
C# has no equivalent of the pandas DataFrame. There are commercial class libraries, but I have not found anything open source.
We therefore need to re-code and substitute a mix of multidimensional arrays, jagged arrays, DataTables, and Dictionaries.
Many Python functions generate series or work on DataFrame slices; in C# this means nested loops or LINQ. It would be good to hide all of this in a DataFrame class or in extension methods.
For a clean implementation in the future, we may need to refactor and create some sort of DataFrame equivalent.
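
As a rough illustration of the "DataFrame class plus extension methods" idea, here is a minimal sketch. It is not the class from any changeset in this project: the names SimpleFrame, AddColumn, Slice, and PercentChange are all hypothetical, and it assumes every column holds doubles of equal length. PercentChange is only loosely modeled on the period-over-period change that pandas offers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal column-oriented frame (assumption: all columns are double[]).
public sealed class SimpleFrame
{
    private readonly Dictionary<string, double[]> _columns =
        new Dictionary<string, double[]>();

    public int RowCount { get; private set; }

    public IEnumerable<string> ColumnNames { get { return _columns.Keys; } }

    // Adds (or overwrites) a named column; every column must have the same length.
    public void AddColumn(string name, double[] values)
    {
        if (values == null) throw new ArgumentNullException("values");
        if (_columns.Count > 0 && values.Length != RowCount)
            throw new ArgumentException("All columns must have the same length.");
        _columns[name] = values;
        RowCount = values.Length;
    }

    // Column lookup by name, similar in spirit to frame["close"] in pandas.
    public double[] this[string name] { get { return _columns[name]; } }

    // Copies rows [start, start + count) of every column into a new frame.
    public SimpleFrame Slice(int start, int count)
    {
        var result = new SimpleFrame();
        foreach (var pair in _columns)
            result.AddColumn(pair.Key, pair.Value.Skip(start).Take(count).ToArray());
        return result;
    }
}

public static class SimpleFrameExtensions
{
    // Period-over-period change of one column; the first element is NaN
    // because it has no previous value to compare against.
    public static double[] PercentChange(this SimpleFrame frame, string column)
    {
        double[] v = frame[column];
        return Enumerable.Range(0, v.Length)
            .Select(i => i == 0 ? double.NaN : v[i] / v[i - 1] - 1.0)
            .ToArray();
    }
}
```

With this shape, calling code reads as frame.Slice(1, 2) or frame.PercentChange("close"), and the loops or LINQ stay hidden inside the class and the extension method.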


wrote Jan 7, 2013 at 10:36 PM

Resolved with changeset 23409: Briefly started the job of abstracting away the DataTable

mashi wrote Jan 13, 2013 at 2:41 AM

** Closed by mashi 1/7/2013 3:36 PM

dmarsh26 wrote Jan 13, 2013 at 2:41 AM

I have started a DataFrame class, but it is not finished and is not used in the EventProfiler.

The task of removing the DataTable is complete, but a design question remains.

Do we hand-craft C# for maximum performance, or do we try to create a high-level abstraction like the R/pandas DataFrame?

The initial conversion from Python was complicated because we had no DataFrame. Having one might make porting easier, but the resulting code may not perform well.
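
To make the trade-off concrete, here is a small sketch of the same calculation written both ways: the mean of each column of a jagged array. The class and method names (ColumnMeanDemo, MeanByLoop, MeanByLinq) are made up for illustration, and the code assumes a non-empty, rectangular array. The LINQ version is shorter and closer to how the Python reads, while the loop version avoids the delegate and enumerator overhead; how much that matters would need to be measured on real data.

```csharp
using System;
using System.Linq;

public static class ColumnMeanDemo
{
    // Hand-written version: a single pass over the rows, accumulating column sums.
    public static double[] MeanByLoop(double[][] rows)
    {
        int cols = rows[0].Length;
        var means = new double[cols];
        foreach (double[] row in rows)
            for (int c = 0; c < cols; c++)
                means[c] += row[c];
        for (int c = 0; c < cols; c++)
            means[c] /= rows.Length;
        return means;
    }

    // LINQ version: for each column index, average that element across all rows.
    public static double[] MeanByLinq(double[][] rows)
    {
        return Enumerable.Range(0, rows[0].Length)
            .Select(c => rows.Average(row => row[c]))
            .ToArray();
    }
}
```

Both methods return the same values, so a DataFrame-style wrapper could start with the LINQ form and switch individual hot spots to loops later without changing the calling code.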

wrote Jan 15, 2013 at 3:49 AM

wrote Feb 14, 2013 at 12:46 AM