Hi all, I have to develop a database like this.
Here, I have a list of words. I need to store the relevancy of each word to every other word in a database. When a new word is added, I need to be able to add a row as well as a column.
One idea I have is this:
CREATE TABLE tbl_Words
(
[WordID] BIGINT NOT NULL IDENTITY(1,1), -- This is the primary key
[Word] VARCHAR(250) NOT NULL -- This is unique
)
CREATE TABLE tbl_WordRelevancy
(
[RelID] BIGINT NOT NULL IDENTITY(1,1), -- Primary key
[Word1] VARCHAR(250) NOT NULL,
[Word2] VARCHAR(250) NOT NULL,
[Relevancy] DECIMAL NOT NULL
)
But with this structure, if there are 100,000 words, the tbl_WordRelevancy table will contain 100,000 * 100,000 rows. I don't think that's good. (This database can grow up to 1M words in one day.) Is it possible to maintain this using a relational database structure? If not, what other ways are there to maintain this structure?
You're close.
CREATE TABLE tbl_Words
(
[WordID] BIGINT NOT NULL IDENTITY(1,1), -- This is the primary key
[Word] VARCHAR(250) NOT NULL -- This is unique
)
Comments don't make WordID a primary key or Word unique.
CREATE TABLE tbl_Words
(
[WordID] BIGINT IDENTITY(1,1) PRIMARY KEY,
[Word] VARCHAR(250) NOT NULL UNIQUE
);
But I think you're really looking for something more along these lines.
create table words (
word varchar(250) primary key
);
create table word_relevance (
word_a varchar(250) not null references words (word),
word_b varchar(250) not null references words (word),
primary key (word_a, word_b),
constraint ordered_words check (word_a <= word_b),
relevance integer not null check (relevance between 0 and 100)
);
The CHECK constraint means you have to put the two words in order before inserting; there seems to be no point in storing both "word 1, word 3" and "word 3, word 1". Since you're using whole numbers for percentages, you're probably better off with an integer than a decimal for relevance.
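To illustrate the ordering requirement, here is a minimal sketch in Python, using an in-memory SQLite database as a stand-in for your real DBMS. The helper names (open_db, set_relevance, get_relevance) are hypothetical, and the column order is rearranged slightly because SQLite wants column definitions before table constraints. It also assumes the application sorts strings the same way the database collation compares them, which holds for SQLite's default binary collation but may not hold elsewhere.

```python
import sqlite3

# Same shape as the word_relevance schema above, adapted for SQLite
# (an assumption: the original question targets a different DBMS).
SCHEMA = """
create table words (
  word varchar(250) primary key
);
create table word_relevance (
  word_a varchar(250) not null references words (word),
  word_b varchar(250) not null references words (word),
  relevance integer not null check (relevance between 0 and 100),
  primary key (word_a, word_b),
  constraint ordered_words check (word_a <= word_b)
);
"""

def open_db():
    """Create an in-memory database with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("pragma foreign_keys = on")
    conn.executescript(SCHEMA)
    return conn

def set_relevance(conn, w1, w2, relevance):
    """Store one row per unordered pair by sorting the words first."""
    a, b = sorted((w1, w2))
    conn.execute(
        "insert or replace into word_relevance (word_a, word_b, relevance) "
        "values (?, ?, ?)",
        (a, b, relevance),
    )

def get_relevance(conn, w1, w2):
    """Look up a pair in either order; returns None if it is not stored."""
    a, b = sorted((w1, w2))
    row = conn.execute(
        "select relevance from word_relevance where word_a = ? and word_b = ?",
        (a, b),
    ).fetchone()
    return row[0] if row else None
```

With this, set_relevance(conn, "pear", "apple", 40) and get_relevance(conn, "apple", "pear") refer to the same single row, and an insert that bypasses the helper with the words in the wrong order is rejected by the ordered_words constraint.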
I don't think you're likely to load a million unique words a day. The second edition of the Oxford English Dictionary has full definitions for fewer than 175,000 words. Your target language may vary, but still . . .
To create your report, use PIVOT and a very restrictive WHERE clause. No DBMS is going to pivot into 175,000 columns. I suspect that no human will want to read more than a page or so, 30 or 40 columns at most.
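As a rough sketch of the "restrictive WHERE clause" idea: SQLite has no PIVOT operator, so this example swaps in a pivot done in application code after selecting only the pairs for a short word list. The function name relevance_grid and the sample rows are made up for illustration.

```python
import sqlite3

def relevance_grid(conn, words):
    """Return {row_word: {col_word: relevance or None}} for a short word list."""
    words = sorted(set(words))
    marks = ", ".join("?" for _ in words)
    # Restrict both sides of the pair to the requested words only.
    sql = (
        "select word_a, word_b, relevance from word_relevance "
        f"where word_a in ({marks}) and word_b in ({marks})"
    )
    grid = {r: {c: None for c in words} for r in words}
    for a, b, rel in conn.execute(sql, words + words):
        # Each pair is stored once, so mirror it into both cells.
        grid[a][b] = rel
        grid[b][a] = rel
    return grid

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table word_relevance (
  word_a varchar(250) not null,
  word_b varchar(250) not null,
  relevance integer not null,
  primary key (word_a, word_b),
  check (word_a <= word_b)
);
insert into word_relevance values
  ('apple', 'fruit', 90),
  ('apple', 'pear', 40),
  ('fruit', 'pear', 85),
  ('car', 'pear', 1);
""")

grid = relevance_grid(conn, ["apple", "pear", "fruit"])
for row_word, cols in grid.items():
    print(row_word, cols)
```

The grid only ever has as many rows and columns as the word list you pass in, so the report stays readable no matter how large the underlying table grows.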
What you actually want is called a many-to-many relationship between words.
CREATE TABLE words (
word VARCHAR(250) PRIMARY KEY
);
CREATE TABLE word_relevancies (
leftword VARCHAR(250) REFERENCES words,
rightword VARCHAR(250) REFERENCES words,
relevance DECIMAL NOT NULL,
PRIMARY KEY (leftword, rightword)
);
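For illustration only, here is how a lookup against that schema might look, sketched in Python with an in-memory SQLite database standing in for your DBMS. The function name related_words and the sample data are hypothetical. Because nothing in this schema forces an order on the pair, the query has to check both columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")
conn.executescript("""
create table words (
  word varchar(250) primary key
);
create table word_relevancies (
  leftword varchar(250) references words,
  rightword varchar(250) references words,
  relevance decimal not null,
  primary key (leftword, rightword)
);
insert into words values ('apple'), ('fruit'), ('pear');
insert into word_relevancies values
  ('apple', 'fruit', 0.9),
  ('pear', 'apple', 0.4);
""")

def related_words(conn, word):
    """Return (other_word, relevance) pairs for a word, most relevant first."""
    sql = """
    select rightword as other, relevance
      from word_relevancies where leftword = ?
    union all
    select leftword, relevance
      from word_relevancies where rightword = ?
    order by relevance desc
    """
    return conn.execute(sql, (word, word)).fetchall()

print(related_words(conn, "apple"))
```

Each stored row is one relationship between two words, which is why adding a new word never requires adding a column.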
In your original formulation, you mistook a relationship for a table. This is a common step in learning how to think in sets.
I strongly suggest you get a book on relational database design, such as Joe Celko's SQL for Smarties or Thinking in Sets and familiarize yourself with proper database design. You'll save yourself a lot of pain.