
some questions about KubeShare2.0 #24

Open
She-xj opened this issue Apr 25, 2023 · 3 comments

Comments


She-xj commented Apr 25, 2023

Hello!
I am installing KubeShare 2.0. I have finished the preparation steps, and kubectl describe node shows:

Capacity:
  cpu:                16
  ephemeral-storage:  29352956Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16248988Ki
  nvidia.com/gpu:     1
  pods:               110
Allocatable:
  cpu:                16
  ephemeral-storage:  27051684205
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16146588Ki
  nvidia.com/gpu:     1
  pods:               110

When I follow deploy.md, I have some questions:

  1. Where should I place the kubeshare-config.yaml file? Could you please tell me its absolute path?
  2. I wonder how to "make sure the endpoint of kubeshare-aggregator & kubeshare-collector of prometheus is up".
  3. Also, could you please show me the Prometheus config files used when monitoring Kubernetes?

I am a beginner in this field, so I would be very grateful if you could provide more details on building the KubeShare 2.0 system.
Looking forward to your reply! Thanks a lot!


icovej commented May 1, 2023

Maybe I can answer some of your questions.
First, you need to place kubeshare-config.yaml under /kubeshare/scheduler.
Second, you can run Prometheus to make sure their endpoints are up.
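
For what it's worth, once Prometheus is running, its targets API (or the Targets page in the web UI) lists every scrape target and its health. A minimal scrape config for the two endpoints might look like the sketch below; the job names and the app pod label are assumptions based on this thread, so check the actual KubeShare manifests for the real label names.

  scrape_configs:
    - job_name: kubeshare-collector
      kubernetes_sd_configs:
        - role: pod                        # discover pods via the Kubernetes API
      relabel_configs:
        - source_labels: [__meta_kubernetes_pod_label_app]
          regex: kubeshare-collector       # assumed pod label; verify in the manifests
          action: keep
    - job_name: kubeshare-aggregator
      kubernetes_sd_configs:
        - role: pod
      relabel_configs:
        - source_labels: [__meta_kubernetes_pod_label_app]
          regex: kubeshare-aggregator      # assumed pod label; verify in the manifests
          action: keep

After reloading Prometheus, curl http://<prometheus-host>:9090/api/v1/targets should report both jobs with health "up".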

But I also have some questions about the GPU topology. I have a cluster with only one node and two GPUs. How should I write its kubeshare-config.yaml?
Thanks!


She-xj commented May 4, 2023


Thanks a lot for your answer! I think your problem is in the line childCellType: "NVIDIA GeForce GTX 3090", which should be written as childCellType: "NVIDIA-GeForce-GTX-3090".
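
For a single node with two GPUs, I would expect the topology part of kubeshare-config.yaml to look roughly like the sketch below. This is only a guess assembled around the childCellType line above: the surrounding field names (cellTypes, childCellNumber, isNodeLevel, cells, cellAddress) and the node placeholder are assumptions, so compare against the example in deploy.md before using it.

  cellTypes:
    GTX3090-NODE:
      childCellType: "NVIDIA-GeForce-GTX-3090"  # GPU model name, dash-separated
      childCellNumber: 2                        # two GPUs under this node type
      isNodeLevel: true                         # this cell type represents a node

  cells:
    - cellType: GTX3090-NODE
      cellAddress: <your-node-name>             # hypothetical placeholder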
About my first question: does "kubeshare" refer to the whole "KubeShare" project directory, or do I need to create a new directory named "kubeshare" inside KubeShare?
Thanks!


icovej commented May 4, 2023


Well, after I built KubeShare, a kubeshare directory appeared under the root directory, and inside it there are logs and other files. Honestly, I don't know whether the whole "KubeShare" project needs to live there, but I found that I do need to place the config.yaml there; even when I rewrote its path in pkg/scheduler/scheduler.go, it still didn't work. So I think you can place the whole "KubeShare" project anywhere, but you may need to modify some content.
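
In case it helps, the placement described above can be done with something like the following; the /kubeshare/scheduler path comes from this thread, and the grep is just one way to locate where the path is referenced in the source:

  $ sudo mkdir -p /kubeshare/scheduler
  $ sudo cp kubeshare-config.yaml /kubeshare/scheduler/kubeshare-config.yaml
  $ grep -n "kubeshare" pkg/scheduler/scheduler.go   # find where the config path is read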
