Thanks Ben, see inline.
On Tue, Sep 10, 2019 at 1:09 AM Walker, Benjamin <benjamin.walker(a)intel.com> wrote:
On Thu, 2019-09-05 at 08:40 +0300, Avner Taieb wrote:
> Hi Ben,
> Thanks for replying. I want the NVMe device to be able to DMA directly
> from the network memory. I also don't want the network process to handle
> any events related to the NVMe device; all NVMe handling should be done in
> the NVMe process without any intervention from the network processes.
> Also, there are a few NVMe processes, and each one handles different device(s).
Does starting the network process as the primary and the NVMe process as the secondary work?
It works, but only because I made a few changes in the SPDK code. Since I don't want
the primary to handle any NVMe events, I commented out some of the NVMe
initialisation. With my changes, the primary initialises only the memory zone for the
NVMe driver, and in the NVMe process (the secondary) I changed the code so that,
besides the memory, it also initialises all the PCI-related pieces.
I effectively defined a new process type, NVME_PRIMARY, to separate the two
initialisation paths.
It did the job, but I am not comfortable with these changes; I am sure I missed
something.
In order to DMA directly, the processes must be sharing memory,
so you can't entirely isolate them. Can you outline what the goals of the
separation are, so I can help you find a solution that meets the
requirements?
You are absolutely right; let me try to explain our needs better:
1. Our environment is Kubernetes, and each process runs in a different POD.
2. We need one POD to run the networking process, more than one POD to serve as an
NVMe driver (each for a different NVMe block device), and more than one POD to serve
as an NVMe target.
3. We also want the different PODs to communicate through a very fast IPC; we chose
SPDK rings for this purpose (see the sketch after this list).
4. To meet all of the above, the different processes need to share memory backed by
the hugepages.
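To make point 3 concrete, below is a rough sketch of the kind of ring hand-off we have
in mind. It assumes all PODs already attach to the same hugepage memory (same shm_id)
and that the DPDK multi-process mappings give them the same virtual addresses; the
ring owner publishes the ring pointer through a named memzone so the other PODs can
look it up. The names here (app_ring_ptr, the ring size) are placeholders, not our
real code.

#include "spdk/stdinc.h"
#include "spdk/env.h"

#define RING_ZONE_NAME "app_ring_ptr"   /* placeholder name */

/* In the POD that owns the ring (e.g. the network POD): create the ring
 * in hugepage-backed shared memory and publish its address in a named
 * memzone that the other PODs can look up. */
struct spdk_ring *
publish_ring(void)
{
	struct spdk_ring *ring;
	struct spdk_ring **slot;

	ring = spdk_ring_create(SPDK_RING_TYPE_MP_SC, 4096, SPDK_ENV_SOCKET_ID_ANY);
	if (ring == NULL) {
		return NULL;
	}

	slot = spdk_memzone_reserve(RING_ZONE_NAME, sizeof(*slot),
				    SPDK_ENV_SOCKET_ID_ANY, 0);
	if (slot == NULL) {
		spdk_ring_free(ring);
		return NULL;
	}
	*slot = ring;   /* valid in the peers only if they map the same VAs */
	return ring;
}

/* In an NVMe POD: find the ring published by the network POD. */
struct spdk_ring *
lookup_ring(void)
{
	struct spdk_ring **slot = spdk_memzone_lookup(RING_ZONE_NAME);

	return slot ? *slot : NULL;
}

The producer then uses spdk_ring_enqueue() and the consumers spdk_ring_dequeue() on
objects allocated from a shared spdk_mempool.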
Another option that crossed my mind is to have separate hugepages for NVMe and for the
shared memory, and to launch every POD as a primary. I did not find an elegant way of
doing this without massive changes in the SPDK code. Is it possible at all in SPDK to
use different hugepage directories?
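I have not actually tried the separate-mount idea; the closest thing I found is the
hugedir knob in spdk_env_opts (or passing DPDK's --huge-dir through env_context).
A minimal sketch of what I imagine, assuming the field exists in the SPDK version in
use and that each POD has its own hugetlbfs mount such as /mnt/huge-net and
/mnt/huge-nvme (both paths are made up):

#include "spdk/stdinc.h"
#include "spdk/env.h"

/* Sketch only: each POD runs as an independent primary on its own
 * hugetlbfs mount.  Whether this gives usable isolation, and how the
 * PODs would then still share the IPC memory, is exactly the open
 * question above. */
static int
init_env_with_hugedir(const char *name, const char *hugedir)
{
	struct spdk_env_opts opts;

	spdk_env_opts_init(&opts);
	opts.name = name;
	opts.hugedir = hugedir;   /* e.g. "/mnt/huge-net" or "/mnt/huge-nvme" */

	if (spdk_env_init(&opts) < 0) {
		fprintf(stderr, "spdk_env_init() failed for %s\n", name);
		return -1;
	}
	return 0;
}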
Specifically, what does "exposed to the nvme devices" mean in terms of memory and code?
What I meant here is that the network process should not access the PCI bus and should
not handle any events related to the NVMe devices, such as device unplug or new device
plug-in. None of its functionality requires that; we need it to handle only networking.
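If it helps to make that concrete: my understanding is that the env layer already has
a no_pci flag in spdk_env_opts that keeps a process off the PCI bus entirely while it
still attaches to the shared hugepage memory. A minimal sketch of how the network
process could be initialised, assuming the flag does what its name suggests (the
process name and shm_id are placeholders):

#include "spdk/stdinc.h"
#include "spdk/env.h"

/* Sketch: bring up the network process with PCI access disabled.  It
 * still joins the shared hugepage memory via shm_id, but it never
 * enumerates or claims NVMe (or any other PCI) devices, so hotplug and
 * unplug events stay with the NVMe processes. */
static int
init_network_env(void)
{
	struct spdk_env_opts opts;

	spdk_env_opts_init(&opts);
	opts.name = "network_proc";   /* placeholder name */
	opts.shm_id = 1;              /* same shm_id as the NVMe processes */
	opts.no_pci = true;           /* stay away from the PCI bus */

	if (spdk_env_init(&opts) < 0) {
		fprintf(stderr, "spdk_env_init() failed\n");
		return -1;
	}
	return 0;
}

Is that flag enough for what I describe, or does the primary still need to own some of
the NVMe/PCI setup on behalf of the secondaries?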
I would appreciate your input and suggestions on how to meet our requirements while
still aligning with the SPDK architecture. My goal is to change nothing in SPDK and to
control this behaviour only through configuration/parameters, unless of course it fits
the SPDK roadmap, in which case we can contribute.
Another option is to not start the network process and the NVMe processes with shared
memory at all (pass them a different -i value). Then, open a Unix domain socket
between the two and, for the memory you do want to share, pass a file descriptor over
the socket that the other side can mmap. That's a more complicated setup, but it does
allow you to share memory selectively.
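That is an interesting option. Just to check that I understand it correctly, here is
roughly what I think you mean: memfd_create() on one side, the fd passed over the Unix
socket with SCM_RIGHTS, and mmap() on the other. The size and names are placeholders,
and this assumes a reasonably recent glibc for memfd_create():

#define _GNU_SOURCE            /* for memfd_create() */
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

#define SHARE_SIZE (2 * 1024 * 1024)   /* placeholder: one 2 MB region */

/* Sender side (e.g. the network process): create an anonymous shared
 * memory region and hand its fd to the peer over a connected Unix socket. */
static int
send_shared_fd(int unix_sock)
{
	int memfd = memfd_create("net_buffers", 0);

	if (memfd < 0 || ftruncate(memfd, SHARE_SIZE) < 0) {
		if (memfd >= 0) {
			close(memfd);
		}
		return -1;
	}

	char dummy = 'x';
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union { char buf[CMSG_SPACE(sizeof(int))]; struct cmsghdr align; } u;
	struct msghdr msg = {
		.msg_iov = &iov, .msg_iovlen = 1,
		.msg_control = u.buf, .msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &memfd, sizeof(int));

	return sendmsg(unix_sock, &msg, 0) < 0 ? -1 : memfd;
}

/* Receiver side (e.g. an NVMe process): receive the fd and map the region. */
static void *
recv_shared_region(int unix_sock)
{
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union { char buf[CMSG_SPACE(sizeof(int))]; struct cmsghdr align; } u;
	struct msghdr msg = {
		.msg_iov = &iov, .msg_iovlen = 1,
		.msg_control = u.buf, .msg_controllen = sizeof(u.buf),
	};
	int fd;

	if (recvmsg(unix_sock, &msg, 0) < 0) {
		return NULL;
	}
	struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_type != SCM_RIGHTS) {
		return NULL;
	}
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));

	return mmap(NULL, SHARE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}

If I read it right, the NVMe side would additionally have to register that region with
spdk_mem_register() (and keep it pinned) before the device can DMA into it, correct?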
>
> Thanks,
> Avner
>
> On Wed, Sep 4, 2019 at 8:50 PM Walker, Benjamin <benjamin.walker(a)intel.com> wrote:
>
> > On Sun, 2019-08-25 at 20:20 +0300, Avner Taieb wrote:
> > > Hi All,
> > > We are using SPDK in a multi-process environment; each process has a
> > > different role, but the processes communicate using the shared memory
> > > pool and ring (using the SPDK API).
> > > Because we are using the shared memory functionality, we need to
> > > designate one of the processes as primary and all the others as secondary.
> > > Here our problems start: we have two types of processes, one type
> > > interfaces an NVMe device and the other type interfaces the network.
> > > We need the network process to serve as primary for the shared memory,
> > > but we don't want it to be exposed to the NVMe devices.
> > > * How can the two types of processes be initialised to meet our
> > > requirements?
> > > * One solution that crossed my mind was to mount two hugepage
> > > directories, one for shared memory and one for the NVMe driver, but I
> > > couldn't find a way of doing it.
> >
> > Are you planning to do a memory copy of all data from memory owned by
> > the network process to memory owned by the storage process? How is that
> > hand-off going to work? Or do you need the NVMe device to be able to DMA
> > directly from the network process memory?
> >
> > Thanks,
> > Ben
> >
> > > Thanks,
> > > Avner Taieb
> > > Reduxio
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk