Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Soma Detection #189

Open
neurologic opened this issue Dec 31, 2024 · 5 comments
Open

Soma Detection #189

neurologic opened this issue Dec 31, 2024 · 5 comments

Comments

@neurologic
Copy link

neurologic commented Dec 31, 2024

Hi,
Overall this pipeline is working great for me. But I am having an issue that I cannot solve with soma detection. Overall, igneous is achieving a skeletonization that looks good. However, when I export to swc via the following process,
vol = CloudVolume('precomputed://http://localhost:8001'); skel = vol.skeleton.get(1); with open(swc_outfile / f'{cell_folder}.swc', 'w') as swc_file: swc_file.write(skel.to_swc());
the node with the largest radius is not the root node (root has link -1, correct?). I can't find a way to check the root node(s) in the Skeleton.components() or Skeleton itself to see if it is just the way it is written to swc or if the soma is not being detected. Also, there are many nodes within the soma (within the expected invalidation sphere) so I was wondering if this is expected behavior, or if there should be a larger gap around the central soma node.

Here is what the mesh and skeleton look like in neuroglancer (igneuous view to a local port) around the soma (with 1-micron scale bar to judge size of soma against):

neuroglancer_screenshot_soma

Here is the lines of the exported swc around the soma nodes (node 2487 "should be" the soma and therefore root, correct?)

swc lines 2455 through 2517
2455 0 190272.000000 274752.000000 62700.000000 700.765320 2454
2456 0 190400.000000 274624.000000 62700.000000 756.962341 2455
2457 0 190272.000000 274496.000000 62700.000000 829.264709 2456
2458 0 190272.000000 274368.000000 62700.000000 913.857727 2457
2459 0 190272.000000 274240.000000 62700.000000 1007.650757 2458
2460 0 190144.000000 274112.000000 62700.000000 1083.402100 2459
2461 0 190016.000000 273984.000000 62700.000000 1156.546631 2460
2462 0 189888.000000 273856.000000 62820.000000 1258.316284 2461
2463 0 189760.000000 273728.000000 62820.000000 1382.403687 2462
2464 0 189632.000000 273600.000000 62820.000000 1488.279541 2463
2465 0 189504.000000 273472.000000 62940.000000 1584.262573 2464
2466 0 189376.000000 273344.000000 62940.000000 1674.753662 2465
2467 0 189248.000000 273216.000000 62940.000000 1774.503906 2466
2468 0 189120.000000 273088.000000 62940.000000 1870.443848 2467
2469 0 188992.000000 272960.000000 62940.000000 1956.077759 2468
2470 0 188864.000000 272832.000000 62940.000000 2054.131348 2469
2471 0 188736.000000 272704.000000 63060.000000 2149.381348 2470
2472 0 188608.000000 272576.000000 63060.000000 2256.411377 2471
2473 0 188480.000000 272448.000000 63180.000000 2317.308838 2472
2474 0 188352.000000 272320.000000 63180.000000 2394.660645 2473
2475 0 188224.000000 272192.000000 63300.000000 2476.281006 2474
2476 0 188096.000000 272064.000000 63300.000000 2582.315186 2475
2477 0 187968.000000 271936.000000 63420.000000 2680.477539 2476
2478 0 187840.000000 271808.000000 63420.000000 2808.900146 2477
2479 0 187712.000000 271680.000000 63540.000000 2863.776611 2478
2480 0 187584.000000 271552.000000 63540.000000 2960.821533 2479
2481 0 187584.000000 271424.000000 63540.000000 3027.849365 2480
2482 0 187584.000000 271296.000000 63540.000000 3094.749023 2481
2483 0 187456.000000 271168.000000 63540.000000 3144.193359 2482
2484 0 187328.000000 271040.000000 63660.000000 3243.948242 2483
2485 0 187200.000000 270912.000000 63660.000000 3329.470947 2484
2486 0 187200.000000 270784.000000 63660.000000 3400.903320 2485
2487 0 186944.000000 270400.000000 63780.000000 3408.262939 2486
2488 0 186816.000000 270272.000000 63900.000000 3374.454590 2487
2489 0 186688.000000 270144.000000 63900.000000 3273.407959 2488
2490 0 186688.000000 270016.000000 63900.000000 3174.306885 2489
2491 0 186816.000000 269888.000000 63900.000000 3077.339111 2490
2492 0 186944.000000 269760.000000 63900.000000 2982.712891 2491
2493 0 187072.000000 269632.000000 63900.000000 2870.038330 2492
2494 0 187200.000000 269504.000000 64020.000000 2785.390381 2493
2495 0 187328.000000 269376.000000 64020.000000 2707.086914 2494
2496 0 187328.000000 269248.000000 64020.000000 2631.124512 2495
2497 0 187456.000000 269120.000000 64020.000000 2543.949707 2496
2498 0 187456.000000 268992.000000 64020.000000 2462.129150 2497
2499 0 187456.000000 268864.000000 64020.000000 2384.375732 2498
2500 0 187456.000000 268736.000000 64020.000000 2308.495605 2499
2501 0 187584.000000 268608.000000 64020.000000 2206.898193 2500
2502 0 187584.000000 268480.000000 64140.000000 2133.241699 2501
2503 0 187712.000000 268352.000000 64140.000000 2047.015381 2502
2504 0 187840.000000 268224.000000 64140.000000 1971.638916 2503
2505 0 187840.000000 268096.000000 64140.000000 1916.864136 2504
2506 0 187968.000000 267968.000000 64020.000000 1841.477661 2505
2507 0 188096.000000 267840.000000 64020.000000 1750.666138 2506
2508 0 187968.000000 267712.000000 64140.000000 1683.044800 2507
2509 0 187968.000000 267584.000000 64140.000000 1577.947998 2508
2510 0 187968.000000 267456.000000 64140.000000 1464.874023 2509
2511 0 188096.000000 267328.000000 64260.000000 1379.739136 2510
2512 0 188096.000000 267200.000000 64260.000000 1273.985840 2511
2513 0 188096.000000 267072.000000 64380.000000 1170.627197 2512
2514 0 188224.000000 266944.000000 64380.000000 1085.909790 2513
2515 0 188224.000000 266816.000000 64380.000000 982.958801 2514
2516 0 188352.000000 266688.000000 64500.000000 923.610291 2515
2517 0 188480.000000 266560.000000 64500.000000 883.484009 2516

And here are the first few lines of the swc showing the root node of the skeleton (it is out at the edge of the segment on a dendritic process)

swc lines 1 through 12
1 0 162240.000000 310592.000000 43380.000000 120.000000 -1
2 0 162240.000000 310592.000000 43500.000000 128.000000 1
3 0 162240.000000 310592.000000 43620.000000 128.000000 2
4 0 162240.000000 310592.000000 43740.000000 120.000000 3
5 0 162368.000000 310720.000000 43860.000000 120.000000 4
6 0 162368.000000 310720.000000 43980.000000 128.000000 5
7 0 162368.000000 310848.000000 44100.000000 175.453705 6
8 0 162496.000000 310976.000000 44220.000000 128.000000 7
9 0 162624.000000 311104.000000 44340.000000 120.000000 8
10 0 162752.000000 311104.000000 44340.000000 120.000000 9
11 0 162880.000000 311104.000000 44460.000000 128.000000 10
12 0 163008.000000 311104.000000 44580.000000 217.181946 11

neuroglancer_screenshot_root

The resolution of the precomputed segmentation I am working with is 128,128,120 in x,y,z -- which is mip3 in VAST export to .raw files for the segmentation). (These skeletons are not for visualization so the low resolution is ok and easier for me to handle the whole process locally on my computer).

I have tried a wide variety of parameters for the soma accept/detect/scale/const. The most recent I have tried is igneous skeleton forge sg2_387415866_crude_mip3 --mip 0 --soma-detect 750 --soma-accept 1500 --soma-const 1000 --soma-scale 3 --const 20 --scale 10 --queue ./queue

Do you have any ideas about what I might be missing and/or what I can try?

Thank you again for your help!
-Krista

@william-silversmith
Copy link
Contributor

Hi Krista,

Thanks for the detailed report. Unfortunately, proper soma handling is something that has been on the table for years, but we never had a pressing need so it never got implemented (various lab members implemented a nucleus detector neural network and didn't need the skeletons for whatever reason).

The problem with somas is lack of visual context. If you use a really big task size, the soma looks much better, since it is contained entirely inside of one task. Another way to accomplish this is to use a lower resolution trace and then connect it to the higher resolution trace (this is a strategy I've been contemplating).

As for the SWC ordering, that is much simpler to solve, but is not implemented currently. I think it would just require the graph search to start from the highest radius node (or one of them if there are several). Currently it just uses the first node in the vertex list (which is usually the end of a branch), mainly because no one asked for anything more sophisticated. There is a complication in calling it a "soma" automatically (setting the vertex type) since that requires knowing the dataset includes somas and a definition of a soma radius.

The SWC reordering is something I could do in the near future.

@neurologic
Copy link
Author

William,
Thanks for the info. Not automatically detecting a root as a "soma" makes sense. Right now I am accomplishing the "soma" assignment and and subsequent swc ordering in a convoluted, but seemingly functional way using neuprint tools and excerpts from your CloudVolume code... (in case anyone else needing to do something similar before it is more elegantly implemented in your code). Neuprint has a "reorient skeleton" function to set a root node based on radius. Part of the reason I go through this process is to label different parts of the neuron, which I currently do with pandas dataframe and networkx.

from neuprint import skeleton as npskel  

# Write the Skeleton to a SWC file
with open(swc_outfile / f'{cell_folder}.swc', 'w') as swc_file:
    swc_file.write(skel.to_swc())

df = npskel.skeleton_swc_to_df(skelfile)  
df.loc[:,'node_type']=0  
npskel.reorient_skeleton(df,use_max_radius=True)
df.loc[df['link'].isin([-1]),'node_type'] = 1 # for my use case, I want this root node to be identified as soma

I am then using networkx DiGraph to help me label other parts of the neuron as dendrite, axon, etc (based on the logic Alex Shapson-Coe used in CREST to label subsets of a collection of segments as different cell components).

Then (this feels like the super convoluted part) I use bits of the CloudVolume code you wrote (like the renumber rows function) to get the dataframe (with node_labels column filled out) back into a correctly ordered swc file:

all_rows = []
# Iterate over each row
for index, r in df.iterrows():
    # Create list for the current row
    row_list =[int(r.rowId), int(r.node_type), r.x, r.y, r.z, r.radius, int(r.link)]
    # append the list to the final list
    all_rows.append(row_list)

all_rows = renumber(all_rows)

swc_header = f"""# ORIGINAL_SOURCE CloudVolume modified scripts 11.1.1
# CREATURE 
# REGION
# FIELD/LAYER
# TYPE
# CONTRIBUTOR kperks
# REFERENCE
# RAW 
# EXTRAS 
# SOMA_AREA
# SHINKAGE_CORRECTION 
# SCALE don't know how to get from dataframe yet 
"""

swc = swc_header + "\n"

swc += "\n".join((
  render_row(row)
  for row in all_rows
))


lines = swc.split("\n")

while len(lines) and (lines[0] == '' or re.match(r'[#\s]', lines[0][0])):
  l = lines.pop(0)

if len(lines) == 0:
  # return Skeleton()
    skel_fromswc = Skeleton()

vertices = []
edges = []
radii = []
vertex_types = []

label_index = {}

N = 0

for line in lines:
  if line.replace(r"\s", '') == '':
    continue
  (vid, vtype, x, y, z, radius, parent_id) = line.split(" ")
  
  coord = tuple([ float(_) for _ in (x,y,z) ])
  vid = int(vid)
  parent_id = int(parent_id)

  label_index[vid] = N

  if parent_id >= 0:
    if vid < parent_id:
      edge = [vid, parent_id]
    else:
      edge = [parent_id, vid]

    edges.append(edge)

  vertices.append(coord)
  vertex_types.append(int(vtype))

  try:
    radius = float(radius)
  except ValueError:
    radius = -1 # e.g. radius = NA or N/A

  radii.append(radius)

  N += 1

for edge in edges:
  edge[0] = label_index[edge[0]]
  edge[1] = label_index[edge[1]]

skel_fromswc = Skeleton(vertices, edges, radii, vertex_types)

with open(swc_outfile / f'{cell_folder}.swc', 'w') as swc_file:
    swc_file.write(skel_fromswc.to_swc())

@neurologic
Copy link
Author

William, I realized I have a follow-up question. If there is no "soma" detection, is the benefit of having the --soma-detect/accept/etc parameters just that it allows the algorithm to deal with cylinders of drastically different radius within one object? Without that, would there just be lots of skeleton branches within the "soma"?

@neurologic
Copy link
Author

I also just noticed that, even though I have re-assigned a "soma" node (largest radius node) as the root in my dataframe:
Screenshot 2025-01-01 at 11 31 29 AM

When I recreate a Skeleton from that and export to swc, the root node is reset to node 1:
1 3 162240.000000 310592.000000 43380.000000 120.000000 -1
2 3 162240.000000 310592.000000 43500.000000 128.000000 1
3 3 162240.000000 310592.000000 43620.000000 128.000000 2
4 3 162240.000000 310592.000000 43740.000000 120.000000 3
5 3 162368.000000 310720.000000 43860.000000 120.000000 4
6 3 162368.000000 310720.000000 43980.000000 128.000000 5
7 3 162368.000000 310848.000000 44100.000000 175.453705 6
8 3 162496.000000 310976.000000 44220.000000 128.000000 7
9 3 162624.000000 311104.000000 44340.000000 120.000000 8
10 3 162752.000000 311104.000000 44340.000000 120.000000 9
11 3 162880.000000 311104.000000 44460.000000 128.000000 10

I thought I had gotten around that, but I did not realize it was resetting.

@william-silversmith
Copy link
Contributor

william-silversmith commented Jan 2, 2025

William, I realized I have a follow-up question. If there is no "soma" detection, is the benefit of having the --soma-detect/accept/etc parameters just that it allows the algorithm to deal with cylinders of drastically different radius within one object? Without that, would there just be lots of skeleton branches within the "soma"?

That's exactly right! Usually we can discriminate somas pretty well in our datasets just using radius. There isn't anything more sophisticated going on than that.

I thought I had gotten around that, but I did not realize it was resetting.

You can check your edge list in the skeleton object. If the soma node is in position skel.edges[0,0] then I think it should work. It looks like you've been digging into my code, so you could try replacing the initial seed with something like (not tested) and see if it works more reliably:

stack = [ 
    skel.edges[np.where(skel.edges == np.argmax(skel.radii))][0,0] 
]

https://github.com/seung-lab/cloud-volume/blob/master/cloudvolume/skeleton.py#L1180

I'll probably give this a more serious look sometime in the near future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants