1 Environment

SQL Server

PRINT @@VERSION
Microsoft SQL Server 2012 - 11.0.2100.60 (X64)
Feb 10 2012 19:39:15
Copyright (c) Microsoft Corporation
Enterprise Edition: Core-based Licensing (64-bit) on Windows NT 6.1 (Build 7601: Service Pack 1)

Operating system

------------------
System Information
------------------
Operating System: Windows 7 Ultimate 64-bit (6.1, Build 7601) Service Pack 1 (7601.win7sp1_gdr.130828-1532)
System Model: Aspire E1-471G
Processor: Intel(R) Core(TM) i5-3230M CPU @ 2.60GHz (4 CPUs), ~2.6GHz
Memory: 4096MB RAM

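2 Goal

From a large batch of strings that mix Chinese characters with numeric codes, extract the codes.

3 Walkthrough

First, prepare the test data. Note that all of it is mock data with no real-world meaning. The statements are as follows: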

CREATE TABLE #temp
(
    name VARCHAR(80)
);

INSERT INTO #temp VALUES ('五道口店3059');
INSERT INTO #temp VALUES ('五羊邨店3060');
INSERT INTO #temp VALUES ('杨家屯店3061');
INSERT INTO #temp VALUES ('十里堤店3062');
INSERT INTO #temp VALUES ('中关村店3063');
INSERT INTO #temp VALUES ('丽秀店3064');
INSERT INTO #temp VALUES ('石门店3065');
INSERT INTO #temp VALUES ('黄村店3066');
INSERT INTO #temp VALUES ('东圃店3067');
INSERT INTO #temp VALUES ('天河店3068');
INSERT INTO #temp VALUES ('人民路广场3069');
INSERT INTO #temp VALUES ('社区中心3070');
INSERT INTO #temp VALUES ('珠海市3071');
INSERT INTO #temp VALUES ('丽都3072');
INSERT INTO #temp VALUES ('晓月3073');
INSERT INTO #temp VALUES ('旧区3074');
INSERT INTO #temp VALUES ('新城3075');
INSERT INTO #temp VALUES ('水井沟3076');
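
One thing worth keeping in mind about the test data: the column is VARCHAR and the literals carry no N prefix, so whether the Chinese characters are stored intact depends on the code page of the database's default collation. The environment above presumably uses a Chinese collation, where this works. A minimal sketch of a Unicode variant, which does not depend on the code page, could look like this (the table name #temp_n is hypothetical and not part of the original walkthrough):

```sql
-- Sketch only: Unicode version of the test table.
-- #temp_n is a hypothetical name used for illustration.
CREATE TABLE #temp_n
(
    name NVARCHAR(80)
);

-- The N prefix marks the literal as Unicode, so the characters
-- are not converted through the database's default code page.
INSERT INTO #temp_n VALUES (N'五道口店3059');
INSERT INTO #temp_n VALUES (N'人民路广场3069');

-- LEN counts characters; DATALENGTH counts bytes.
SELECT name,
       LEN(name)        AS char_count,
       DATALENGTH(name) AS byte_count
FROM #temp_n;

DROP TABLE #temp_n;
```

The rest of the walkthrough works the same way on either version of the table.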

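Next, look at the data. It follows a pattern: each code is numeric and four characters long.

The character immediately before the number is one of nine: 店, 场, 心, 市, 都, 月, 区, 城, 沟.
We start by using SQL Server's built-in functions Substring, Charindex, Rtrim, and Ltrim to strip the prefix from the strings that contain the most frequent of these characters (店).
The statement is as follows: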

SELECT Rtrim(Ltrim(Substring(name, Charindex('店', name) + 1, Len(name)))) AS name
INTO #t1
FROM #temp

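Usage notes for these functions follow. The Substring, Rtrim, and Ltrim entries are quoted from the SSIS expression reference, which is why their result type is listed as DT_WSTR; the T-SQL functions used above take the same arguments.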

Substring

Returns the part of a character expression that starts at the specified position and has the specified length. The position parameter and the length parameter must evaluate to integers.

Syntax

SUBSTRING(character_expression, position, length)

Arguments

character_expression

Is a character expression from which to extract characters.

position

Is an integer that specifies where the substring begins.

length

Is an integer that specifies the length of the substring as number of characters.

Result Types

DT_WSTR

Charindex
Searches an expression for another expression and returns its starting position if found.

Syntax

CHARINDEX ( expressionToFind ,expressionToSearch [ , start_location ] )

Arguments
expressionToFind
Is a character expression that contains the sequence to be found. expressionToFind is limited to 8000 characters.
expressionToSearch
Is a character expression to be searched.
start_location
Is an integer or bigint expression at which the search starts. If start_location is not specified, is a negative number, or is 0, the search starts at the beginning of expressionToSearch.

Return Types
bigint if expressionToSearch is of the varchar(max), nvarchar(max), or varbinary(max) data types; otherwise, int.

Rtrim
Returns a character expression after removing trailing spaces.

RTRIM does not remove white space characters such as the tab or line feed characters. Unicode provides code points for many different types of spaces, but this function recognizes only the Unicode code point 0x0020. When double-byte character set (DBCS) strings are converted to Unicode they may include space characters other than 0x0020 and the function cannot remove such spaces. To remove all kinds of spaces, you can use the Microsoft Visual Basic .NET RTrim method in a script run from the Script component.

Syntax
RTRIM(character expression)
              
Arguments
character_expression
Is a character expression from which to remove spaces.

Result Types
DT_WSTR

Ltrim
Returns a character expression after removing leading spaces.

LTRIM does not remove white-space characters such as the tab or line feed characters. Unicode provides code points for many different types of spaces, but this function recognizes only the Unicode code point 0x0020. When double-byte character set (DBCS) strings are converted to Unicode they may include space characters other than 0x0020 and the function cannot remove such spaces. To remove all kinds of spaces, you can use the Microsoft Visual Basic .NET LTrim method in a script run from the Script component.

Syntax
LTRIM(character expression)
              
Arguments
character_expression
Is a character expression from which to remove spaces.

Result Types
DT_WSTR

Now check the result. Every string that contained 店 has been reduced to its code.

SELECT * FROM #t1

3059
3060
3061
3062
3063
3064
3065
3066
3067
3068
人民路广场3069
社区中心3070
珠海市3071
丽都3072
晓月3073
旧区3074
新城3075
水井沟3076
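
The rows without 店 come through unchanged because Charindex returns 0 when the character is not found, so Substring starts at position 0 + 1 = 1 and returns the whole string. A small sketch of that behaviour, using two of the sample values (the variable names are illustrative only):

```sql
-- Sketch: what CHARINDEX and SUBSTRING do for a hit and for a miss.
DECLARE @hit  NVARCHAR(80) = N'五道口店3059';  -- contains 店
DECLARE @miss NVARCHAR(80) = N'珠海市3071';    -- does not contain 店

SELECT CHARINDEX(N'店', @hit)  AS pos_hit,   -- 4: 店 is the fourth character
       CHARINDEX(N'店', @miss) AS pos_miss,  -- 0: not found
       -- Start one character after 店 and take the rest of the string.
       SUBSTRING(@hit,  CHARINDEX(N'店', @hit)  + 1, LEN(@hit))  AS code_hit,
       -- With position 0 the start becomes 1, so the full string comes back.
       SUBSTRING(@miss, CHARINDEX(N'店', @miss) + 1, LEN(@miss)) AS code_miss;
```

This is what makes it safe to run the same statement once per prefix character: each pass only changes the rows that contain that character.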

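Next we handle, in turn, the strings containing 场, 心, 市, 都, 月, 区, 城, and 沟. Before each pass we list the rows that still contain Chinese characters. The statements and their results follow: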

SELECT *
FROM #t1
WHERE name LIKE N'%[一-龥]%' COLLATE Chinese_PRC_BIN

人民路广场3069
社区中心3070
珠海市3071
丽都3072
晓月3073
旧区3074
新城3075
水井沟3076

SELECT Rtrim(Ltrim(Substring(name, Charindex('场', name) + 1, Len(name)))) AS name
INTO #t2
FROM #t1

SELECT *
FROM #t2
WHERE name LIKE N'%[一-龥]%' COLLATE Chinese_PRC_BIN

社区中心3070
珠海市3071
丽都3072
晓月3073
旧区3074
新城3075
水井沟3076

SELECT Rtrim(Ltrim(Substring(name, Charindex('心', name) + 1, Len(name)))) AS name
INTO #t3
FROM #t2

SELECT *
FROM #t3
WHERE name LIKE N'%[一-龥]%' COLLATE Chinese_PRC_BIN

珠海市3071
丽都3072
晓月3073
旧区3074
新城3075
水井沟3076

SELECT Rtrim(Ltrim(Substring(name, Charindex('市', name) + 1, Len(name)))) AS name
INTO #t4
FROM #t3

SELECT *
FROM #t4
WHERE name LIKE N'%[一-龥]%' COLLATE Chinese_PRC_BIN

丽都3072
晓月3073
旧区3074
新城3075
水井沟3076

SELECT Rtrim(Ltrim(Substring(name, Charindex('都', name) + 1, Len(name)))) AS name
INTO #t5
FROM #t4

SELECT *
FROM #t5
WHERE name LIKE N'%[一-龥]%' COLLATE Chinese_PRC_BIN

晓月3073
旧区3074
新城3075
水井沟3076

SELECT Rtrim(Ltrim(Substring(name, Charindex('月', name) + 1, Len(name)))) AS name
INTO #t6
FROM #t5

SELECT *
FROM #t6
WHERE name LIKE N'%[一-龥]%' COLLATE Chinese_PRC_BIN

旧区3074
新城3075
水井沟3076

SELECT Rtrim(Ltrim(Substring(name, Charindex('区', name) + 1, Len(name)))) AS name
INTO #t7
FROM #t6

SELECT *
FROM #t7
WHERE name LIKE N'%[一-龥]%' COLLATE Chinese_PRC_BIN

新城3075
水井沟3076

SELECT Rtrim(Ltrim(Substring(name, Charindex('城', name) + 1, Len(name)))) AS name
INTO #t8
FROM #t7

SELECT *
FROM #t8
WHERE name LIKE N'%[一-龥]%' COLLATE Chinese_PRC_BIN

水井沟3076

SELECT Rtrim(Ltrim(Substring(name, Charindex('沟', name) + 1, Len(name)))) AS name
INTO #t9
FROM #t8

SELECT *
FROM #t9
WHERE name LIKE N'%[一-龥]%' COLLATE Chinese_PRC_BIN -- no rows
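
The passes above remove one prefix character at a time. As an alternative that is not part of the original walkthrough, the position of the first digit can be located with PATINDEX, which avoids listing the prefix characters at all. This is only a sketch: it assumes every code is written with ASCII digits and sits at the end of the string, and the table name #demo is hypothetical.

```sql
-- Sketch: extract the code in one pass by finding the first digit.
-- #demo is a hypothetical table holding a few of the sample values.
CREATE TABLE #demo
(
    name NVARCHAR(80)
);

INSERT INTO #demo
VALUES (N'五道口店3059'),
       (N'人民路广场3069'),
       (N'水井沟3076');

-- PATINDEX returns the 1-based position of the first match, or 0 if none.
SELECT name,
       SUBSTRING(name, PATINDEX(N'%[0-9]%', name), LEN(name)) AS code
FROM #demo
WHERE PATINDEX(N'%[0-9]%', name) > 0;  -- skip rows with no digit at all

DROP TABLE #demo;
```

Because every code in this data set is exactly four characters long, RIGHT(name, 4) would also work here; the PATINDEX form is the more general of the two when code lengths vary.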

This is the final result. With the codes extracted, I can join them against database tables to get the data I need.

SELECT *
INTO #result
FROM #t9

SELECT *
FROM #result

name
3059
3060
3061
3062
3063
3064
3065
3066
3067
3068
3069
3070
3071
3072
3073
3074
3075
3076

SELECT s.xxx,
       s.xxx
FROM xx s
JOIN #result r
ON s.xxx = r.name
WHERE s.xxx = 0;
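
The extracted codes are still character data. If the column they are joined to is numeric, SQL Server converts the character side implicitly, and a single leftover non-numeric value makes the query fail. The sketch below makes the conversion explicit with TRY_CONVERT, which is available from SQL Server 2012 and returns NULL instead of raising an error. The tables #store and #codes and their columns are hypothetical stand-ins for the placeholder names in the join.

```sql
-- Sketch: join extracted codes to a numeric key with an explicit conversion.
-- #store and #codes are hypothetical tables used only for illustration.
CREATE TABLE #store
(
    store_id   INT,
    store_name NVARCHAR(40)
);

CREATE TABLE #codes
(
    name VARCHAR(80)
);

INSERT INTO #store VALUES (3059, N'Store A'), (3070, N'Store B');
INSERT INTO #codes VALUES ('3059'), ('3070'), ('3099');

-- TRY_CONVERT yields NULL for values that are not valid integers,
-- so such rows simply do not match instead of aborting the query.
SELECT s.store_id,
       s.store_name
FROM #store s
JOIN #codes c
  ON s.store_id = TRY_CONVERT(INT, c.name);

DROP TABLE #store;
DROP TABLE #codes;
```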

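4 Summary

The core of this article comes down to two statements. The first uses SQL Server's built-in functions to extract the code that follows a given character: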

SELECT Rtrim(Ltrim(Substring(name, Charindex('店', name) + 1, Len(name)))) AS name
INTO #t1
FROM #temp

The second checks whether a string contains Chinese characters:

SELECT *
FROM #t1
WHERE name LIKE N'%[一-龥]%' COLLATE Chinese_PRC_BIN
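
The check relies on the binary collation: under Chinese_PRC_BIN the range [一-龥] is compared by character code rather than by linguistic sort order, and 一 (U+4E00) through 龥 (U+9FA5) covers the common CJK ideographs. A minimal sketch of the same predicate used as a flag, with illustrative sample values in a table variable:

```sql
-- Sketch: flag each value by whether it contains a Chinese character.
-- @samples and its contents are illustrative only.
DECLARE @samples TABLE
(
    name NVARCHAR(80)
);

INSERT INTO @samples
VALUES (N'3059'),        -- digits only
       (N'珠海市3071'),   -- Chinese prefix plus digits
       (N'abc');         -- Latin letters only

SELECT name,
       CASE
           WHEN name LIKE N'%[一-龥]%' COLLATE Chinese_PRC_BIN THEN 1
           ELSE 0
       END AS has_chinese
FROM @samples;
```

Only the second value should be flagged, since it is the only one with a character inside the range.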

At work, noticing and writing down small tricks like these lets you get more done with less effort.

Good Luck!
