-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dataframe: add row appending methods and sql scan helpers. #50
Conversation
Some notes:
|
I would more time with but think it would be helpful to @sunker to get this merged soon then later if even means we change / fix some things in master. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work; I enjoyed reviewing it a lot!
I have left some comments/thoughts but with my limited knowledge on the subject I think it's good to get merged.
dataframe/dataframe.go
Outdated
} | ||
} | ||
// second loop that modifies | ||
f.AppendRow(vals...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of calling AppendRow
and looping over vals
twice you could do the appending inside the above loop using:
f.Fields[i].Vector.Append(v)
func (v *nullablegenVector) Len() int { | ||
return len((*v)) | ||
} | ||
|
||
func (v *nullablegenVector) PrimitiveType() VectorPType { | ||
return vectorPType(v) | ||
} | ||
|
||
func (v *nullablegenVector) Extend(i int) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know the use case but maybe it would be useful to accept also an arbitrary number of values to initialise the new slice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand you correctly it already does. Extend grows the vector be argument i.
dataframe/sql.go
Outdated
) | ||
|
||
// NewFromSQLRows returns a new dataframe populated with the data from rows. The Vector types | ||
// will be []*T if nullable or will be []*T if it is unknown if the column is nullable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a bit confusing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typod - will fix
dataframe/sql.go
Outdated
seen := map[string]struct{}{} | ||
for _, name := range colNames { | ||
if _, ok := seen[name]; ok { | ||
return nil, nil, fmt.Errorf(`duplicate column names no allowed, found identical name: "%v"`, name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The index of the identical column could be included in the error
Alternatively, instead of having this restriction and returning an error I wonder if the new field name could be named after the column name and the column index?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to be solved at a deeper level. I think perhaps something in MarshalArrow takes care of it. I meant to make an issue and then link it in the comment and forgot, will do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#57 (added to code comments as well).
// Nullabe types get passed to scan as a pointer to a pointer | ||
vec = reflect.MakeSlice(reflect.SliceOf(ptrType), 0, 0).Interface() | ||
} | ||
if !ValidVectorType(vec) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder it this condition is enough; is it necessary for all the columns have a mapper? if yes then this condition should check also if a mapping exists? otherwise maybe here there should be an additional check for this?
As it is now I think it's possible to end up with a nil
mapper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There would only be a nil mapper if you passed a nil mapper to NewFromSQLRows. If a mapper isn't provided, and it isn't a type natively supported by both dataframe and SQL scanner, it becomes a *string (via auto-generated mapper) which anything can be scanned into.
dataframe/sql.go
Outdated
|
||
// scannableRow adds a row to the dataframe, and returns a slice of references | ||
// that can be passed to rows.Scan() in the in sql package. | ||
func (f *Frame) scannableRow() []interface{} { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer the name newScannableRow
that highlights the fact the a new row is created.
} | ||
} | ||
|
||
for fieldIdx, mapper := range mappers { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had seen that discussion while researching this - but only at proposal stage :-(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// The dataframe's Fields and the Fields' Vectors must be initalized or AppendRow will panic. | ||
// The number of arguments must match the number of Fields in the Frame and each type must coorespond | ||
// to the Field type or AppendRow will panic. | ||
func (f *Frame) AppendRow(vals ...interface{}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to keep this?
Why someone would prefer this over the AppendRowSafe
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Speed. It should be faster to use if you already know that both your data and your frame will meet what is checked for, bypassing all the checks.
wip for #49