I ran into an issue today and I have been stumped for some time in trying to get the results I am searching for.
I currently have a class that resembles the following:
public class InstanceInformation
{
    public string PatientID { get; set; }
    public string StudyID { get; set; }
    public string SeriesID { get; set; }
    public string InstanceID { get; set; }
}
I have a List<InstanceInformation>
and I am trying to use LINQ (or whatever other means) to generate paths (for a file directory) based on this list that resemble the following:
PatientID/StudyID/SeriesID/InstanceID
My issue is that the data arrives unstructured, as the flat List mentioned above, and I need a way to group all of it with the following constraints:
- Group InstanceIDs by SeriesID
- Group SeriesIDs by StudyID
- Group StudyIDs by PatientID
I currently have something that resembles this:
var groups = from instance in instances
             group instance by instance.PatientID into patientGroups
             from studyGroups in
                 (from instance in patientGroups
                  group instance by instance.StudyID)
             from seriesGroup in
                 (from instance in studyGroups
                  group instance by instance.SeriesID)
             from instanceGroup in
                 (from instance in seriesGroup
                  group instance by instance.InstanceID)
             group instanceGroup by patientGroups.Key;
which just groups all of my InstanceIDs by PatientID, and it's quite hard to cull through all of the data after this massive grouping to see if the areas in between (StudyID/SeriesID) are being lost. Any other methods of solving this issue would be more than welcome.
This is primarily just for grouping the objects, as I would then need to iterate through them (using a foreach).
I have no idea if the query you've come up with is the query you actually want or need, but assuming that it is, let's consider the question of whether there is a better way to write it.
The place you want to look is section 7.16.2.1 of the C# 4 specification, a portion of which I quote here for your convenience:
A query expression with a continuation
from ... into x ...
is translated into
from x in ( from ... ) ...
Is that clear? Let's take a look at a fragment of your query that I've marked with stars:
var groups = from instance in instances
             group instance by instance.PatientID into patientGroups
             from studyGroups in
       ****      (from instance in patientGroups
                  group instance by instance.StudyID) ****
             from seriesGroup in
                 (from instance in studyGroups
                  group instance by instance.SeriesID)
             from instanceGroup in
                 (from instance in seriesGroup
                  group instance by instance.InstanceID)
             group instanceGroup by patientGroups.Key;
Here we have
from studyGroups in ( from ... ) ...
the spec says that this is equivalent to
from ... into studyGroups ...
so we can rewrite your query as
var groups = from instance in instances
             group instance by instance.PatientID into patientGroups
             from instance in patientGroups
             group instance by instance.StudyID into studyGroups
             from seriesGroup in
       ****      (from instance in studyGroups
                  group instance by instance.SeriesID) ****
             from instanceGroup in
                 (from instance in seriesGroup
                  group instance by instance.InstanceID)
             group instanceGroup by patientGroups.Key;
Do it again. Now we have
from seriesGroup in (from ... ) ...
and the spec says that this is the same as
from ... into seriesGroup ...
so rewrite it like that:
var groups = from instance in instances
             group instance by instance.PatientID into patientGroups
             from instance in patientGroups
             group instance by instance.StudyID into studyGroups
             from instance in studyGroups
             group instance by instance.SeriesID into seriesGroup
             from instanceGroup in
       ****      (from instance in seriesGroup
                  group instance by instance.InstanceID) ****
             group instanceGroup by patientGroups.Key;
And again!
var groups = from instance in instances
             group instance by instance.PatientID into patientGroups
             from instance in patientGroups
             group instance by instance.StudyID into studyGroups
             from instance in studyGroups
             group instance by instance.SeriesID into seriesGroup
             from instance in seriesGroup
             group instance by instance.InstanceID into instanceGroup
             group instanceGroup by patientGroups.Key;
Which I hope you agree is a whole lot easier to read. I would improve its readability more by changing the fact that "instance" is used half a dozen times to mean different things:
var groups = from instance in instances
             group instance by instance.PatientID into patientGroups
             from patientGroup in patientGroups
             group patientGroup by patientGroup.StudyID into studyGroups
             from studyGroup in studyGroups
             group studyGroup by studyGroup.SeriesID into seriesGroups
             from seriesGroup in seriesGroups
             group seriesGroup by seriesGroup.InstanceID into instanceGroup
             group instanceGroup by patientGroups.Key;
Whether this is actually the query you need to solve your problem, I don't know, but at least this one you can reason about without turning yourself inside out trying to follow all the nesting.
This technique is called "query continuation". Basically the idea is that the continuation introduces a new range variable over the query so far.
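As a rough illustration of how a continuation behaves, here is a small sketch that assumes the InstanceInformation class from the question; the ContinuationDemo class and its PatientStudyKeys method are names made up for this example. One thing it tries to show: range variables declared before an `into` are no longer in scope after it, so a later clause cannot refer to an earlier group's Key unless that key is carried along explicitly. Here it is carried in a composite grouping key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class InstanceInformation
{
    public string PatientID { get; set; }
    public string StudyID { get; set; }
    public string SeriesID { get; set; }
    public string InstanceID { get; set; }
}

// Hypothetical helper, not part of the original question or answer.
public static class ContinuationDemo
{
    // Groups instances by patient, then regroups them by (patient, study).
    // The patient key is copied into the composite key because "patientGroup"
    // goes out of scope once the second "into" starts a new continuation.
    public static List<string> PatientStudyKeys(IEnumerable<InstanceInformation> instances)
    {
        var query = from instance in instances
                    group instance by instance.PatientID into patientGroup
                    from patientInstance in patientGroup
                    group patientInstance
                        by new { Patient = patientGroup.Key, patientInstance.StudyID }
                        into studyGroup
                    select studyGroup.Key.Patient + "/" + studyGroup.Key.StudyID;

        return query.ToList();
    }
}
```

Calling ContinuationDemo.PatientStudyKeys(instances) should give one "PatientID/StudyID" string per distinct patient and study pair. The same pattern could be extended with SeriesID if deeper levels are needed.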
I think this will yield what you're looking for:
using System;
using System.Collections.Generic;
using System.Linq;

public class InstanceInformation {
    public string PatientID { get; set; }
    public string StudyID { get; set; }
    public string SeriesID { get; set; }
    public string InstanceID { get; set; }
    public override string ToString() {
        return String.Format("Series = {0} Study = {1} Patient = {2}", SeriesID, StudyID, PatientID);
    }
}

class Program {
    static void Main(string[] args) {
        List<InstanceInformation> infos = new List<InstanceInformation>() {
            new InstanceInformation(){ SeriesID = "A", StudyID = "A1", PatientID = "P1" },
            new InstanceInformation(){ SeriesID = "A", StudyID = "A1", PatientID = "P1" },
            new InstanceInformation(){ SeriesID = "A", StudyID = "A1", PatientID = "P2" },
            new InstanceInformation(){ SeriesID = "A", StudyID = "A2", PatientID = "P1" },
            new InstanceInformation(){ SeriesID = "B", StudyID = "B1", PatientID = "P1" },
            new InstanceInformation(){ SeriesID = "B", StudyID = "B1", PatientID = "P1" },
        };

        // Group by series, split each series group by study, then split each of those by patient.
        IEnumerable<IGrouping<string, InstanceInformation>> bySeries = infos.GroupBy(g => g.SeriesID);
        IEnumerable<IGrouping<string, InstanceInformation>> byStudy = bySeries.SelectMany(g => g.GroupBy(g_inner => g_inner.StudyID));
        IEnumerable<IGrouping<string, InstanceInformation>> byPatient = byStudy.SelectMany(g => g.GroupBy(g_inner => g_inner.PatientID));

        foreach (IGrouping<string, InstanceInformation> group in byPatient) {
            Console.WriteLine(group.Key);
            foreach (InstanceInformation II in group)
                Console.WriteLine("    " + II.ToString());
        }
    }
}
In your class, override the ToString method, like below.
public class InstanceInformation
{
    public string PatientID { get; set; }
    public string StudyID { get; set; }
    public string SeriesID { get; set; }
    public string InstanceID { get; set; }

    // Formats the instance as PatientID/StudyID/SeriesID/InstanceID.
    public override string ToString()
    {
        return string.Format("{0}/{1}/{2}/{3}", PatientID, StudyID, SeriesID, InstanceID);
    }
}

// "list" is the List<InstanceInformation>; ConvertAll already returns a List<string>.
var listofstring = list.ConvertAll(x => x.ToString());
var listofstringdistinct = listofstring.Distinct().ToList();
This is easier to read and understand.
I don't know exactly what you need, but this (rather long) code will return a dictionary (of dictionaries...) grouped as you said (i.e. PatientID/StudyID/SeriesID/InstanceID):
var byPatient = new Dictionary<string, Dictionary<string, Dictionary<string, Dictionary<string, InstanceInformation>>>>();
foreach (var patientGroup in instances.GroupBy(x => x.PatientID))
{
    var byStudy = new Dictionary<string, Dictionary<string, Dictionary<string, InstanceInformation>>>();
    byPatient.Add(patientGroup.Key, byStudy);
    foreach (var studyGroup in patientGroup.GroupBy(x => x.StudyID))
    {
        var bySeries = new Dictionary<string, Dictionary<string, InstanceInformation>>();
        byStudy.Add(studyGroup.Key, bySeries);
        foreach (var seriesIdGroup in studyGroup.GroupBy(x => x.SeriesID))
        {
            var byInstance = new Dictionary<string, InstanceInformation>();
            bySeries.Add(seriesIdGroup.Key, byInstance);
            foreach (var inst in seriesIdGroup)
            {
                byInstance.Add(inst.InstanceID, inst);
            }
        }
    }
}
P.S.
I've considered InstanceID as unique among all instances. Otherwise, the last dictionary level should be: Dictionary<string, List<InstanceInformation>>
EDIT:
Reading your last comment, I think you don't need a real GroupBy, but rather an OrderBy().ThenBy()...
foreach (var el in instances.OrderBy(x => x.PatientID)
                            .ThenBy(x => x.StudyID)
                            .ThenBy(x => x.SeriesID)
                            .ThenBy(x => x.InstanceID))
{
    // it yields:
    // Pat1 Std1 Srs1 Inst1
    // Pat1 Std1 Srs1 Inst2
    // Pat1 Std1 Srs2 Inst1
    // Pat1 Std2 Srs2 Inst2
    // ...
}
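To connect the ordering approach back to the PatientID/StudyID/SeriesID/InstanceID paths from the question, here is a minimal sketch. It assumes the InstanceInformation class as posted; PathBuilder and BuildPaths are hypothetical names, and the ordinal comparer is just one possible choice of sort order. A literal "/" is used to match the format in the question; for real file-system paths, System.IO.Path.Combine may be a better fit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class InstanceInformation
{
    public string PatientID { get; set; }
    public string StudyID { get; set; }
    public string SeriesID { get; set; }
    public string InstanceID { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class PathBuilder
{
    // Sorts by patient, study, series and instance (ordinal string comparison),
    // then projects each instance to a "Patient/Study/Series/Instance" path.
    public static List<string> BuildPaths(IEnumerable<InstanceInformation> instances)
    {
        return instances
            .OrderBy(x => x.PatientID, StringComparer.Ordinal)
            .ThenBy(x => x.StudyID, StringComparer.Ordinal)
            .ThenBy(x => x.SeriesID, StringComparer.Ordinal)
            .ThenBy(x => x.InstanceID, StringComparer.Ordinal)
            .Select(x => string.Join("/", x.PatientID, x.StudyID, x.SeriesID, x.InstanceID))
            .ToList();
    }
}
```

Because the list is sorted on all four keys, paths that share a patient, study or series end up next to each other, which may be enough when the goal is only to walk the tree in order.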
The following LINQ statement in query syntax should solve your problem.
var groups = from instance in instances
             group instance by instance.PatientGuid into patientGroups
             select new
             {
                 patientGroups.Key,
                 StudyGroups = from instance in patientGroups
                               group instance by instance.StudyGuid into studyGroups
                               select new
                               {
                                   studyGroups.Key,
                                   SeriesGroups = from c in studyGroups
                                                  group c by c.SeriesGuid into seriesGroups
                                                  select seriesGroups
                               }
             };
You can then iterate your groups with the following set of nested foreach loops on the groups. This will allow you to create your directory tree efficiently and do any other operations at each level.
foreach (var patientGroups in groups)
{
    Console.WriteLine("Patient Level = {0}", patientGroups.Key);
    foreach (var studyGroups in patientGroups.StudyGroups)
    {
        Console.WriteLine("Study Level = {0}", studyGroups.Key);
        foreach (var seriesGroups in studyGroups.SeriesGroups)
        {
            Console.WriteLine("Series Level = {0}", seriesGroups.Key);
            foreach (var instance in seriesGroups)
            {
                Console.WriteLine("Instance Level = {0}", instance.InstanceGuid);
            }
        }
    }
}
This is a proof of concept, but initial testing shows that it works properly. Any comments would be appreciated.
Eric Lippert perfectly explained how you can avoid the horrible nesting and write just a single flat query using "query continuation" (the into keyword).
I think you can do one more step and write it directly using the GroupBy method. Sometimes, using the LINQ methods directly gives you clearer code and I think this is one such example:
var groups = instances.
    GroupBy(instance => instance.PatientID).
    GroupBy(patientGroup => patientGroup.StudyID).
    GroupBy(studyGroup => studyGroup.SeriesID).
    GroupBy(seriesGroup => seriesGroup.InstanceID).
    GroupBy(instanceGroup => patientGroups.Key);
(I don't really know if this is what you're looking for - I just did a "syntactic transformation" of what Eric wrote - and I believe I didn't change the meaning of Eric's query)
EDIT: There may be some trickery with the last group by, because it is not completely regular.
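For comparison, here is a method-syntax sketch that does compile. It is not a literal translation of the chain above: it assumes that what is wanted is a nested grouping (patient, then study, then series), and the NestedGrouping class and its Group method are names invented for this example. Each GroupBy is applied to the members of the enclosing group rather than to the groups themselves, since an IGrouping exposes only its Key and its elements, not the properties of those elements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class InstanceInformation
{
    public string PatientID { get; set; }
    public string StudyID { get; set; }
    public string SeriesID { get; set; }
    public string InstanceID { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class NestedGrouping
{
    // Builds patient -> study -> series -> list of instance IDs.
    public static Dictionary<string, Dictionary<string, Dictionary<string, List<string>>>> Group(
        IEnumerable<InstanceInformation> instances)
    {
        return instances
            .GroupBy(i => i.PatientID)
            .ToDictionary(
                patient => patient.Key,
                patient => patient
                    .GroupBy(i => i.StudyID)
                    .ToDictionary(
                        study => study.Key,
                        study => study
                            .GroupBy(i => i.SeriesID)
                            .ToDictionary(
                                series => series.Key,
                                series => series.Select(i => i.InstanceID).ToList())));
    }
}
```

With this shape, NestedGrouping.Group(instances)["P1"]["S1"]["A"] would give the instance IDs stored under that patient, study and series, and nested foreach loops over the dictionaries can build the directory tree level by level.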