I am new to database programming and want some tips on performance / best practices. I am parsing some websites to scrap television episode infos and placing them into an MS SQL 2008 R2 relational database.
Lets say i have a table filled with type Episode. When i start a new parsing, i generate a new list of Episodes. The thing is, i want the database to match exactly as the new list. Currently I'm doing a mass delete all then insert all. Problem is I'm not sure if this is the best way to go about it, especially since i'm concerned about data persistancy (episode_id primary indexes staying the same for long periods of time).
Is there some easy way to insert any new episodes into the table, update any ones that have changed, and delete any that no longer exist anymore, such that the end result is exactly the same as the new list of episodes. An episode would be compared by the series id, season number, and episode number.
Edit:
A Series type contains a list of multiple different episode types, for instance:
List<TVDBEpisode>
List<TVRageEpisode>
List<TVcomEpisode>
I would parse a single site at a time for instance:
public void ParseTVDB(Series ser)
{
var eps = new List<TVDBEpisode>();
//... Parse tvdb and add each epsiode to this list
//... Make the Series' existing TVDBEpisodes match the new TVDBEpisodes
}
public void ParseTVRage(Series ser)
{
var eps = new List<TVRageEpisode>();
//... Parse tvrage and add each epsiode to this list
//... Make the Series' existing TVRageEpisodes match the new TVRageEpisodes
}
public void ParseTVcom(Series ser)
{
开发者_如何学C var eps = new List<TVcomEpisode>();
//... Parse tvcom and add each epsiode to this list
//... Make the Series' existing TVcomEpisodes match the new TVcomEpisodes
}
Yes, have a look at the AttachAllOnSubmit(), InsertAllOnSubmit(), and DeleteAllOnSubmit(). The methods are built into LINQ to SQL and will executive the apprpriate operations that you're referring to above. Here some example code.
I ended up using a drastically different approach all-together, but it got the job done. Figure i would post it here in case anyone has a similar problem as is curious how i solved it. I am not too familiar with SQL specifics since i am new to databases, but i am pretty good with LINQ from normal object querying, so i decided to go an all LINQ approach.
Change the partial class to include a variable to check whether the item was found this time around:
public partial class TVDBEpisode
{
public bool IsFound { get; set; }
}
Parse the new episodes, linking them with any existing ones
public void ParseTVDB(Series ser)
{
var oldEps = ser.TVDBEpisodes.ToList();
foreach ( /*LOOP THROUGH FOUND EPISODES FROM TVDB */ )
{
string season = ;// parse season from website
string epnumber = ;// parse epnumber from website
TVDBEpisode ep;
// Find an episode that matches this one already in database
var oldEp = oldEps.FirstOrDefault((o) => o.Season == season && o.EpNumber == epnumber);
if (oldEp == null)
{
// Create new item
ep = new TVDBEpisode();
// link with series (auto adds new item to the database)
ser.TVDBEpisodes.Add(ep);
}
else
{
// Get the item already in the database so we can modify it
ep = oldEp;
}
ep.SeasonNumber = season;
ep.EpisodeNumber = epnumber;
//PARSE THE REST OF THE INFO FOR THE EPISODE
// Set IsFound to true, because this item has been updated
ep.IsFound = true;
// Delete any item that was not updated (IsFound == false)
// Note that this only works on a fresh series that has been wiped where all eps start as false
context.TVDBEpisodes.DeleteAllOnSubmit(oldEps.Where((t) => !t.IsFound));
context.SubmitChanges();
}
}
精彩评论