-
Notifications
You must be signed in to change notification settings - Fork 147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DataStructure organization #169
Comments
@forki I found the Vector breaking change because it is in a project of mine I update every week, so it was no big deal for me to change code in my project. Any change I might make to Vector in its current form would be a breaking change for someone else. I really want to move forward on DataStructure work within a new framework, as has been discussed on several threads. I don't think you have published all of your ideas on this subject. I am willing to put together the source pull requests to set-up a new structure, but you know much more about the build and NuGet. On some other thread you mentioned FSharpx 1.7. Does it make sense to institute DS organizational changes in this release? |
Do you want to create a new nuget package "Fsharpx.Datastructures" or a complete new product? |
I would propose you create a new project in a Datastructure branch. I then try to create an additional build for a nuget pre-release. After a while we release it together with the JSON changes as 1.7. - I always wanted to create a pre-release build for fsharpx. |
I suggest the following:
I can update the Data Structure "rules" I posted earlier with feedback I have received and any feedback you want to give and post it to the Wiki. Today I don't think there are any collections or data structures fully ready for the new "finished" project. Ideally there would be some sort of voting for structures to graduate to this project. I can create a "roadmap" of finishing touches I think are required. Community members can "finish" collections in the SandBox, and eventually a copy of a finished collection gets pulled into the finished project. |
Ok. That sounds perfect. |
@dsyme @ovatsus @mausch @DanielFabian @7sharp9 @forki and anyone else interested: pending final input from you all, FSharpx 1.7 is moving forward with the following re-organization of collections/datastructures. Naming standards are important to get right from the beginning, and impossible to correct after the fact, so please make your thoughts known on namespace names or anything else about the proposal.
|
+1 - thanks for your work. |
Does that include adding the obsolete attribute? On 12 Dec 2012, at 17:34, jackfoxy notifications@github.com wrote:
|
@7sharp9 consider the obsolete attribute included (pending further comment). |
+1 |
+1 I don't think "persistent" works as a name (e.g. due to confusion with "persistence" in the sense of automatic storage of the collection to disk). Perhaps FSharpx.Collections or FSharpx.Collections.Immutable |
"Immutable" it is. "Functional" has also been suggested ... |
I don't favour "Functional" because, strictly speaking, these data structures uually have little to do with functional programming in the "lambda" and "function value" sense of the term - for example, they would make sense in a first-order programming environment where functions are not first-class values. |
"Immutable" it is, that makes sense. |
+1 Thanks for the work |
"Persistent" is used in clojure and scala world. We could link to: http://en.wikipedia.org/wiki/Persistent_data_structure I am personally not sure if the distinction makes that much sense, but the vector is not fully immutable. It is often constructed by a TransientVector which is mutable. After the first use it is flagged as read-only and allows only to create new vectors. |
Why don't we just leave them directly under FSharpx.Collections, and just use the sub-namespace for the mutables? |
yep that sounds like a good idea. |
Does it really worth emphasising that much mutable vs immutable? Wouldn't it be sufficient to put a comment for each collection? |
Actually it's a rather theoretical discussion. AFAIK atm we don't have
|
By visual inspection Collections/CircularBuffer and DataStructures/RingBuffer appear to be mutable to me. There has been discussion on other threads of adding HashSet from PowerPack, which is mutable. Has the number of files in the Collections folder shrunk? (I had not looked closely at it in the last 3 weeks or so.) It seems like it almost only contains extension methods now. Since LazyList is now an FSharpx collection/datastructure, does it not make sense to incorporate its module method "mapAccum" directly into its module instead of having it in Collections.fs? Also some "helper" methods datastructures/Infrastructure.fs. |
I think the HashSet in PowerPack is the same as the HashSet in System.Collections.Generic and is just there from the time it wasn't still included in the BCL |
@ovatsus You're right. I meant to refer to HashMultiMap in PP, which is mutable. |
That would be more interesting, but I would first rename it to MultiMap and do a little bit of interface cleanup to be more in line with the current Map in F#, at least removing the deprecated members |
@dsyme If you had to do it over again, would you distinguish mutable/immutable in the Microsoft.FSharp.Collections namespace? |
@ovatsus says "Why don't we just leave them directly under FSharpx.Collections, and just use the sub-namespace for the mutables?" I'm liking this idea more. The FSharpx.Core.Collections folder looks smaller and more organized than I remember it. Still all of the collections in it (LazyList, ByteString, CircularBuffer, and NonEmptyList) are in the FSharpx namespace, so changing them to FSharpx.Collections namespace (and in the case of CircularBuffer FSharpx.Collections.Mutable namespace) is a breaking change. |
I tend to like simpler solutions, I really don't like too many nested namespaces. About breaking changes, I honestly don't think we should worry too much about it and leave old deprecated versions, etc... I don't think there are that many people using them (I might be wrong), and it's an easy fix, you can always get the previous version on NuGet. FSharpx is in a relatively early stage, it's not an old mature library (I think the fact we're reorganizing stuff and discussing namespaces proves that), it's better to get things right at this phase than to maintain stability. |
I'm a minimalist too. I have to admit I had not looked at the collections folder lately, and my memory of it was of a more mixed-up state. Seems very organized now. If @forki agrees I will make namespace changes before the end of the weekend. Still planning to spin-off datastructures into a stand-alone project and Collections.SandBox namespace. |
yes please. |
Almost a non-breaking change, by adding AutoOpen of new namespaces to FSharpx.Core AssemblyInfo. But strangely some files in CSharpTests had to have using FSharpx.Collections added when previously they worked without using FSharpx; Something about C# interop I do not understand. |
Yes, I saw that. I'm planning on running some benchmarks. |
I must have accidently closed this issue. Appologies. I opened a discussion thread https://groups.google.com/forum/?hl=en&fromgroups#!topic/pragmatic-functional-programming-research/CtcBeLepCDkon on naming data structure functions. Please contribute to the discussion. |
@forki I've been benchmarking pre-release System.Collections.Immutable.Queue against the Queue I have ready for FSharpx.Collections. The one I have is just as fast (or slightly faster in some cases) and has more functions. |
Feel free to replace it. |
@jackfoxy Which queue were you benchmarking against? The mutable array On 3 January 2013 09:43, Steffen Forkmann notifications@github.com wrote:
|
@7sharp9 In total I benchmarked 10 queues, including the mutable System.Collections.Queue and System.Collections.Concurrent.Queue, which of course won almost every benchmark. But the interesting one was pre-release System.Collections.Immutable.Queue. I think a modified version of the immutable FSharpx.Datastructures.BatchedQueue is actually better. I will post it as FSharpx.Collections.Queue as soon as I resolve the function naming issue. I recently started a contract job which is taking a lot of my time, but I will publish the full benchmarks in a couple days. |
I've been meaning to write a more coherent exposition on this topic, here it finally is http://jackfoxy.com/semantics-and-list-like-data-structures |
@forki My proposal for the next step evolving FSharpx.Collections:
I can make both of these changes, but I still have not come up to speed with Fake or NuGet, so I realize this makes more work for you. What do you think? |
You can do it. I'll fix the build if needed. |
Time for interested people to let me know what they think: It occurred to me we can keep the Git history of Datastructure test and collection test files if I refer to them in their current folders from their new projects. However, if no one objects, I will create new folders for the new test projects (FSharpx.Collections.Tests and FSharpx.Collections.Experimental.Tests) and we will lose history for the test files involved. Let me know if you object to this, or have another idea of how to preserve the history. |
If by "losing history" you mean losing the ability to run a git blame on those files and get the history prior to this hypothetical move, then I don't really mind in this case. IMHO go ahead and do it... |
@mausch Yeah, that's what I mean by "losing history". |
Actually even this can be analysed (with some weird tricks) - but I think it's really no problem. |
@forki NuGet FSharpx.Collections.Experimental ships without FSharpx.Core. That may be by design. I can think of good reasons not to bundle Core.dll in the Experimental NuGet package. I really do not not which way is better...anyway, thanks for your work in completing the Datastructure re-organization! |
we can set up a dependency on the FSharpx.Core package if that helps?! |
@forki That's probably a good idea. |
Seems this is already the case see http://nuget.org/packages/FSharpx.Collections.Experimental/ |
...hmmm. Good old NuGet! There is no way to force NuGet to include Core in the Experimental package? I wouldn't worry about it too much. If you do install Experimental without Core the error message is pretty explicit about what needs fixing, and I assume if there were a version mis-match between Experimental and Core that would also be apparent. It is an "experimental" project, after all. |
It would be easy to include the Core.dll in the package but this might result in conflicts when some installs the FSharpx.Core.dll package. Actually I think the current solution is the best (modulo nuget bugs). |
Yeah, I agree. |
Just a quick query, is fsharpx portable profile friendly? On 29 Jan 2013, at 16:56, jackfoxy notifications@github.com wrote:
|
DataStructures and Collections have been re-organized into namespaces FSharpx.Collections, FSharpx.Collections.Mutable, and FSharpx.Collections.Experimental. This issue is complete. |
A timely recommendation for moving forward with FSharpx.DataStructures from Don Syme on this thread #165 (comment) . Jon Harrop recently wrote a lengthy response on this thread https://groups.google.com/forum/?hl=en&fromgroups=#!topic/pragmatic-functional-programming-research/K_yKAMDGJS0 to my article discussing DataStructure project organization http://jackfoxy.com/fsharpx-data-structures-discussion/
DataStructures has had an "R&D" quality, which is beneficial, so I suggest breaking DataStructures off into two projects or subprojects, FunctionalRandD and Functional. The first project would continue where the current DataStructures leaves off, but Functional would have more stringent requirements for entry and update. We could learn from failed experiments in R&D and only promote a single best-of-breed queue, heap, etc. to Functional.
I also suggest porting my project DS_Benchmark https://github.com/jackfoxy/DS_Benchmark to FunctionalRandD. It offers a consistent benchmarking infrastructure and scripting, so benchmarks can be run independently and verified. (For some reason the readme.md stopped displaying, but if you click the readme.md file it does display.) It has its architectural warts, but will benefit from other eyes making improvements.
The text was updated successfully, but these errors were encountered: