Melody~ Blogger^^ The Software sarhing space~*: **Multilingual System Database**

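For example, suppose we are building a system for mainland China and 纽伦新港 (apparently shorthand for New York, London, Singapore, and Hong Kong). The languages are then known up front: Simplified Chinese, Traditional Chinese, and English, and we can be sure that no further languages will be added later.

**Whether languages will need to be added later is an important question: it decides whether the database design has to allow for extending the set of languages.**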

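Starting with the database design, multilingual data can be implemented in the following ways.

1. Create one column per language for each multilingual field.

Suppose we have a client table that records the client Id, client name, client address, client telephone number, and so on. The name and address are multilingual and must support Simplified Chinese, Traditional Chinese, and English, so the client table can be designed as follows: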

create table Client
(
    ClientId int primary key,
    NameChs nvarchar(50),
    NameCht nvarchar(50),
    NameEng varchar(200),
    AddressChs nvarchar(50),
    AddressCht nvarchar(50),
    AddressEng varchar(200),
    TelephoneNumber varchar(50)
)

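The advantage of this approach is that it is easy to understand and easy to query. One client instance is exactly one row in the database, just as in an ordinary single-language database, and since no additional table is introduced, no extra JOIN is needed, so queries are efficient: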

insert into Client values(1,'工商银行','工商銀行','ICBC',
    '中国北京','中國北京','China,Beijing','13811255555');

select * 
from Client c 
where c.ClientId=1
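
With one column per language, the application usually has to decide which column to read. The following is only a minimal sketch of that idea, run against an in-memory SQLite database through Python's sqlite3 module rather than the database the post targets. The `LANG_SUFFIX` mapping and the `get_client` helper are hypothetical names introduced for illustration.

```python
import sqlite3

# Hypothetical whitelist: language code -> column suffix used in the Client table.
LANG_SUFFIX = {"CHS": "Chs", "CHT": "Cht", "ENG": "Eng"}

def get_client(conn, client_id, lang):
    """Return (ClientId, Name, Address, TelephoneNumber) for one language."""
    suffix = LANG_SUFFIX[lang]  # raises KeyError for an unsupported language
    # Column names are built only from the whitelist above, never from user input.
    sql = (
        f"select ClientId, Name{suffix}, Address{suffix}, TelephoneNumber "
        "from Client where ClientId = ?"
    )
    return conn.execute(sql, (client_id,)).fetchone()

conn = sqlite3.connect(":memory:")
conn.execute(
    """create table Client (
        ClientId int primary key,
        NameChs nvarchar(50), NameCht nvarchar(50), NameEng varchar(200),
        AddressChs nvarchar(50), AddressCht nvarchar(50), AddressEng varchar(200),
        TelephoneNumber varchar(50))"""
)
conn.execute(
    "insert into Client values (1,'工商银行','工商銀行','ICBC',"
    "'中国北京','中國北京','China,Beijing','13811255555')"
)

english = get_client(conn, 1, "ENG")
traditional = get_client(conn, 1, "CHT")
```

Supporting another language in this design means adding new columns to the table and a new entry to the whitelist, which is the extensibility cost of this approach.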

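2. Create a single shared translation table that stores each language in its own column, and have the entity tables reference it by foreign key.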

create table Translation 
(
    TranslationId int primary key,
    TextChs nvarchar(200),
    TextCht nvarchar(200),
    TextEng varchar(200)
)

create table Client
(
    ClientId int primary key,
    NameTranId int references Translation(TranslationId),
    AddressTranId int references Translation(TranslationId),
    TelephoneNumber varchar(200)
)

To query the data, the Translation table has to be joined twice, once for the Name translations and once for the Address translations.

insert into Translation values(10,'工商银行','工商銀行','ICBC');
insert into Translation values(20,'中国北京','中國北京','China,Beijing');
insert into Client values(1,10,20,'13811255555');

select c.ClientId,c.TelephoneNumber,
tn.TextChs as NameChs,tn.TextCht as NameCht,tn.TextEng as NameEng,
ta.TextChs as AddressChs,ta.TextCht as AddressCht,ta.TextEng as AddressEng
from Client c
inner join Translation tn
on c.NameTranId=tn.TranslationId
inner join Translation ta
on c.AddressTranId=ta.TranslationId
where c.ClientId=1

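Both approaches above expose the languages as columns: there are as many columns as there are languages, which makes adding a language difficult. The approaches below instead store each language as a data row, so languages can be added at any later time without changing the database schema.
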
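3. Move the multilingual fields of each table into a companion multilingual table.

The multilingual table references the original table by foreign key, holds one column for each multilingual field, and adds a "language" column. Taking the Client table as the example again, the table structure is: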

create table Client
(
    ClientId int primary key,
    TelephoneNumber varchar(200)
)
create table Client_MultiLanguages
(
    CLId int primary key,
    ClientId int references Client(ClientId),
    Name nvarchar(200),
    Address nvarchar(200),
    Language char(3)
)

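The advantage is extensibility: the schema does not name any specific language, so adding a language only requires inserting a corresponding row into the multilingual table. Queries are also fairly simple: join the original table to its multilingual table and put the desired language in the WHERE clause. For example, to query the English version of the Client with Id 1: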

insert into Client values(1,'13811255555');
insert into Client_MultiLanguages values(1,1,'工商银行','中国北京','CHS');
insert into Client_MultiLanguages values(2,1,'工商銀行','中國北京','CHT');
insert into Client_MultiLanguages values(3,1,'ICBC','China,Beijing','ENG');

select c.*,cm.Name,cm.Address 
from Client c inner join Client_MultiLanguages cm  
on c.ClientId=cm.ClientId  
where c.ClientId=1 and cm.Language='ENG'
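
To make the extensibility claim concrete, here is a small sketch that adds a language to this design with nothing but an insert. It uses an in-memory SQLite database through Python's sqlite3 module as a stand-in for the real database. The `get_client` helper, the 'JPN' code, and the Japanese sample text are assumptions made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Client (
    ClientId int primary key,
    TelephoneNumber varchar(200)
);
create table Client_MultiLanguages (
    CLId int primary key,
    ClientId int references Client(ClientId),
    Name nvarchar(200),
    Address nvarchar(200),
    Language char(3)
);
insert into Client values (1,'13811255555');
insert into Client_MultiLanguages values (1,1,'工商银行','中国北京','CHS');
insert into Client_MultiLanguages values (2,1,'工商銀行','中國北京','CHT');
insert into Client_MultiLanguages values (3,1,'ICBC','China,Beijing','ENG');
""")

def get_client(conn, client_id, lang):
    """Same join as in the post, with the Id and language passed as parameters."""
    return conn.execute(
        "select c.ClientId, c.TelephoneNumber, cm.Name, cm.Address "
        "from Client c inner join Client_MultiLanguages cm "
        "on c.ClientId = cm.ClientId "
        "where c.ClientId = ? and cm.Language = ?",
        (client_id, lang),
    ).fetchone()

before = get_client(conn, 1, "JPN")  # no Japanese row exists yet

# Adding a language is an ordinary insert; no "alter table" is involved.
conn.execute(
    "insert into Client_MultiLanguages values (4,1,'中国工商銀行','中国北京','JPN')"
)
after = get_client(conn, 1, "JPN")
```

Note that the inner join returns nothing for a language that has no row yet, so callers need to handle a missing translation.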

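4. Create a shared translation table plus a companion table that holds the per-language text, and point each multilingual column at the translation table.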

create table Translation 
(
    TranslationId int primary key
)

create table Client
(
    ClientId int primary key,
    NameTranId int references Translation(TranslationId),
    AddressTranId int references Translation(TranslationId),
    TelephoneNumber varchar(200)
)


create table TranslationEntity 
(
    TranslationEntityId int primary key,
    TranslationId int references Translation(TranslationId),
    Language char(3),
    TranslatedText nvarchar(200)
)

To query the English version of the Client with Id 1, the script is:

insert into Translation values(10);
insert into Translation values(20);
insert into Client values(1,10,20,'13811255555');
insert into TranslationEntity values(1,10,'CHS','工商银行');
insert into TranslationEntity values(2,10,'CHT','工商銀行');
insert into TranslationEntity values(3,10,'ENG','ICBC');
insert into TranslationEntity values(4,20,'CHS','中国北京');
insert into TranslationEntity values(5,20,'CHT','中國北京');
insert into TranslationEntity values(6,20,'ENG','China,Beijing');


select c.ClientId,tne.TranslatedText as Name,
tae.TranslatedText as Address,c.TelephoneNumber
from Client c
inner join TranslationEntity tne
on c.NameTranId=tne.TranslationId
inner join TranslationEntity tae
on c.AddressTranId=tae.TranslationId
where c.ClientId=1 and tne.Language='ENG' and tae.Language='ENG'

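Inserting and querying data in this design is rather complicated! Note also that the query never touches the Translation table. That table only identifies each multilingual field of each data instance, so its keys could just as well come from a database sequence or a GUID, as long as they are globally unique. The query also joins TranslationEntity twice; if a table has many multilingual fields, say 10, the query needs 10 JOINs, which will be inefficient. Likewise, Language='ENG' appears twice in the WHERE clause and has to be repeated for every multilingual field. Finally, the query above is not rigorous: nothing guarantees that both Name and Address have an English value, and if either is missing the query returns no rows. The correct form is therefore: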

select c.ClientId,tne.TranslatedText as Name,
tae.TranslatedText as Address,c.TelephoneNumber
from Client c
left join TranslationEntity tne
on c.NameTranId=tne.TranslationId and tne.Language='ENG'
left join TranslationEntity tae
on c.AddressTranId=tae.TranslationId and tae.Language='ENG'
where c.ClientId=1
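
The difference between the two query forms only shows up when a translation is missing. The sketch below loads the data without the English address and runs both queries. It uses an in-memory SQLite database through Python's sqlite3 module, which is an assumption of this example, and it leaves out the Translation table because neither query reads it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Client (
    ClientId int primary key,
    NameTranId int,
    AddressTranId int,
    TelephoneNumber varchar(200)
);
create table TranslationEntity (
    TranslationEntityId int primary key,
    TranslationId int,
    Language char(3),
    TranslatedText nvarchar(200)
);
-- The address (TranslationId 20) deliberately has no 'ENG' row.
insert into Client values (1,10,20,'13811255555');
insert into TranslationEntity values (1,10,'CHS','工商银行');
insert into TranslationEntity values (3,10,'ENG','ICBC');
insert into TranslationEntity values (4,20,'CHS','中国北京');
""")

INNER_JOIN_QUERY = """
select c.ClientId, tne.TranslatedText, tae.TranslatedText, c.TelephoneNumber
from Client c
inner join TranslationEntity tne on c.NameTranId = tne.TranslationId
inner join TranslationEntity tae on c.AddressTranId = tae.TranslationId
where c.ClientId = ? and tne.Language = 'ENG' and tae.Language = 'ENG'
"""

LEFT_JOIN_QUERY = """
select c.ClientId, tne.TranslatedText, tae.TranslatedText, c.TelephoneNumber
from Client c
left join TranslationEntity tne
  on c.NameTranId = tne.TranslationId and tne.Language = 'ENG'
left join TranslationEntity tae
  on c.AddressTranId = tae.TranslationId and tae.Language = 'ENG'
where c.ClientId = ?
"""

inner_rows = conn.execute(INNER_JOIN_QUERY, (1,)).fetchall()  # client disappears
left_rows = conn.execute(LEFT_JOIN_QUERY, (1,)).fetchall()    # client kept, Address is NULL
```

The language filter has to sit in the ON clause for this to work. Moving it back into WHERE would discard the row with the NULL address and undo the effect of the left join.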

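In a real project that uses an O/R mapping tool such as NHibernate, each multilingual field is mapped to a collection, so loading an instance in a given language takes N+1 SQL queries rather than one JOIN query, where N is the number of multilingual properties on the object.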
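
The N+1 pattern can be imitated by hand to see where the extra queries come from. This sketch is not NHibernate and says nothing about how any particular ORM is configured. It only mimics the access pattern of loading one collection per multilingual property, using Python's sqlite3 module with an in-memory database, and the helper names (`run`, `load_lazily`, `load_joined`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Client (
    ClientId int primary key,
    NameTranId int,
    AddressTranId int,
    TelephoneNumber varchar(200)
);
create table TranslationEntity (
    TranslationEntityId int primary key,
    TranslationId int,
    Language char(3),
    TranslatedText nvarchar(200)
);
insert into Client values (1,10,20,'13811255555');
insert into TranslationEntity values (1,10,'CHS','工商银行');
insert into TranslationEntity values (2,10,'CHT','工商銀行');
insert into TranslationEntity values (3,10,'ENG','ICBC');
insert into TranslationEntity values (4,20,'CHS','中国北京');
insert into TranslationEntity values (5,20,'CHT','中國北京');
insert into TranslationEntity values (6,20,'ENG','China,Beijing');
""")

stats = {"queries": 0}

def run(sql, params=()):
    """Execute one statement and count it."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()

def load_lazily(client_id, lang):
    """One query for the Client row, then one per multilingual property."""
    cid, name_id, addr_id, phone = run(
        "select ClientId, NameTranId, AddressTranId, TelephoneNumber "
        "from Client where ClientId = ?", (client_id,))[0]
    texts = []
    for tran_id in (name_id, addr_id):
        # Each property loads its whole collection of translations.
        collection = dict(run(
            "select Language, TranslatedText from TranslationEntity "
            "where TranslationId = ?", (tran_id,)))
        texts.append(collection.get(lang))
    return (cid, texts[0], texts[1], phone)

def load_joined(client_id, lang):
    """The single left-join query from the post, parameterized."""
    return run(
        "select c.ClientId, tne.TranslatedText, tae.TranslatedText, c.TelephoneNumber "
        "from Client c "
        "left join TranslationEntity tne "
        "  on c.NameTranId = tne.TranslationId and tne.Language = ? "
        "left join TranslationEntity tae "
        "  on c.AddressTranId = tae.TranslationId and tae.Language = ? "
        "where c.ClientId = ?", (lang, lang, client_id))[0]

lazy = load_lazily(1, "ENG")
lazy_queries = stats["queries"]    # N + 1 with N = 2 multilingual properties

stats["queries"] = 0
joined = load_joined(1, "ENG")
joined_queries = stats["queries"]  # a single statement
```

Both paths return the same data. The difference is the number of round trips, which grows with the number of multilingual properties in the lazy version.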

Wednesday, June 5, 2013
