Making BusyBox Support Chinese

Reposted from: http://blog.csdn.net/wavemcu/article/details/7202908

***************************************************************************************************************************
Author: EasyWave    Date: 2012.01.15

Category: Linux driver development    Notice: when reposting, please keep the link

***************************************************************************************************************************

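In embedded Linux systems, BusyBox is the most common tool for building the root filesystem. From BusyBox 1.17.0 onward, however, the ls command cannot display Chinese unless the source is modified. Even if the kernel is configured with Chinese support, ls in the shell still cannot show Chinese, because versions from 1.17.0 on restrict non-ASCII output. This article explains how to modify BusyBox 1.17.0 and later to support Chinese. Two files need to be changed: printable_string.c and unicode.c. First, let us look at why ls cannot display Chinese. Here is the unmodified code from printable_string.c: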

const char* FAST_FUNC printable_string(uni_stat_t *stats, const char *str)

{

    static char *saved[4];

    static unsigned cur_saved; /* = 0 */

    char *dst;

    const char *s;

    s = str;

    while (1) {

        unsigned char c = *s;

        if (c == '\0') {

            /* 99+% of inputs do not need conversion */

            if (stats) {

                stats->byte_count = (s - str);

                stats->unicode_count = (s - str);

                stats->unicode_width = (s - str);

            }

            return str;

        }

        if (c < ' ')

            break;

        if (c >= 0x7f)

            break;

        s++;

    }

#if ENABLE_UNICODE_SUPPORT

    dst = unicode_conv_to_printable(stats, str);

#else

    {

        char *d = dst = xstrdup(str);

        while (1) {

            unsigned char c = *d;

            if (c == '\0')

                break;

            if (c < ' ' || c >= 0x7f)

                *d = '?';

            d++;

        }

        if (stats) {

            stats->byte_count = (d - dst);

            stats->unicode_count = (d - dst);

            stats->unicode_width = (d - dst);

        }

    }

#endif

    free(saved[cur_saved]);

    saved[cur_saved] = dst;

    cur_saved = (cur_saved + 1) & (ARRAY_SIZE(saved)-1);

    return dst;

}

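In the code above, the if (c >= 0x7f) break; test in the first loop and the if (c < ' ' || c >= 0x7f) test in the #else branch show what happens: any byte of 0x7F or above either breaks out of the fast path or is replaced directly with '?'. So even if the Linux kernel is configured with Chinese support, Chinese names cannot be displayed; they show up as '?'. The modified code is shown below, with the removed tests commented out: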

const char* FAST_FUNC printable_string(uni_stat_t *stats, const char *str)

{

    static char *saved[4];

    static unsigned cur_saved; /* = 0 */

    char *dst;

    const char *s;

    s = str;

    while (1) {

        unsigned char c = *s;

        if (c == '\0') {

            /* 99+% of inputs do not need conversion */

            if (stats) {

                stats->byte_count = (s - str);

                stats->unicode_count = (s - str);

                stats->unicode_width = (s - str);

            }

            return str;

        }

        if (c < ' ')

            break;

    /*

        if (c >= 0x7f)

            break;

    */

        s++;

    }

#if ENABLE_UNICODE_SUPPORT

    dst = unicode_conv_to_printable(stats, str);

#else

    {

        char *d = dst = xstrdup(str);

        while (1) {

            unsigned char c = *d;

            if (c == '\0')

                break;

            if (c < ' ' /*|| c >= 0x7f */)

                *d = '?';

            d++;

        }

        if (stats) {

            stats->byte_count = (d - dst);

            stats->unicode_count = (d - dst);

            stats->unicode_width = (d - dst);

        }

    }

#endif

    free(saved[cur_saved]);

    saved[cur_saved] = dst;

    cur_saved = (cur_saved + 1) & (ARRAY_SIZE(saved)-1);

    return dst;

}

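With the changes above, and provided that [ ] Support Unicode is not selected when configuring BusyBox 1.17.0, ls can display Chinese; I have tested this myself. There is another case, though: [*] Support Unicode is selected when configuring BusyBox 1.17.0, as shown below: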

In the configuration, Support Unicode is selected:

Busybox Settings->General Configuration->

   │ │[ ] Enable locale support (system needs locale for this to work)     │ │

   │ │[*] Support Unicode                                                  │ │

   │ │[*] Support for --long-options                                       │ │

In that case one more file has to be modified: unicode.c. Without this change, ls still cannot display Chinese. Here is the unmodified code:

static char* FAST_FUNC unicode_conv_to_printable2(uni_stat_t *stats, const char *src, unsigned width, int flags)

{

    char *dst;

    unsigned dst_len;

    unsigned uni_count;

    unsigned uni_width;

    if (unicode_status != UNICODE_ON) {

        char *d;

        if (flags & UNI_FLAG_PAD) {

            d = dst = xmalloc(width + 1);

            while ((int)--width >= 0) {

                unsigned char c = *src;

                if (c == '\0') {

                    do

                        *d++ = ' ';

                    while ((int)--width >= 0);

                    break;

                }

                *d++ = (c >= ' ' && c < 0x7f) ? c : '?';

                src++;

            }

            *d = '\0';

        } else {

            d = dst = xstrndup(src, width);

            while (*d) {

                unsigned char c = *d;

                if (c < ' ' || c >= 0x7f)

                    *d = '?';

                d++;

            }

        }

        if (stats) {

            stats->byte_count = (d - dst);

            stats->unicode_count = (d - dst);

            stats->unicode_width = (d - dst);

        }

        return dst;

    }

    dst = NULL;

    uni_count = uni_width = 0;

    dst_len = 0;

    while (1) {

        int w;

        wchar_t wc;

#if ENABLE_UNICODE_USING_LOCALE

        {

            mbstate_t mbst = { 0 };

            ssize_t rc = mbsrtowcs(&wc, &src, 1, &mbst);

            /* If invalid sequence is seen: -1 is returned,

             * src points to the invalid sequence, errno = EILSEQ.

             * Else number of wchars (excluding terminating L'\0')

             * written to dest is returned.

             * If len (here: 1) non-L'\0' wchars stored at dest,

             * src points to the next char to be converted.

             * If string is completely converted: src = NULL.

             */

            if (rc == 0) /* end-of-string */

                break;

            if (rc < 0) { /* error */

                src++;

                goto subst;

            }

            if (!iswprint(wc))

                goto subst;

        }

#else

        src = mbstowc_internal(&wc, src);

        /* src is advanced to next mb char

         * wc == ERROR_WCHAR: invalid sequence is seen

         * else: wc is set

         */

        if (wc == ERROR_WCHAR) /* error */

            goto subst;

        if (wc == 0) /* end-of-string */

            break;

#endif

        if (CONFIG_LAST_SUPPORTED_WCHAR && wc > CONFIG_LAST_SUPPORTED_WCHAR)

            goto subst;

        w = wcwidth(wc);

        if ((ENABLE_UNICODE_COMBINING_WCHARS && w < 0) /* non-printable wchar */

         || (!ENABLE_UNICODE_COMBINING_WCHARS && w <= 0)

         || (!ENABLE_UNICODE_WIDE_WCHARS && w > 1)

        ) {

 subst:

            wc = CONFIG_SUBST_WCHAR;

            w = 1;

        }

        width -= w;

        /* Note: if width == 0, we still may add more chars,

         * they may be zero-width or combining ones */

        if ((int)width < 0) {

            /* can't add this wc, string would become longer than width */

            width += w;

            break;

        }

        uni_count++;

        uni_width += w;

        dst = xrealloc(dst, dst_len + MB_CUR_MAX);

#if ENABLE_UNICODE_USING_LOCALE

        {

            mbstate_t mbst = {  };

            dst_len += wcrtomb(&dst[dst_len], wc, &mbst);

        }

#else

        dst_len += wcrtomb_internal(&dst[dst_len], wc);

#endif

    }

    /* Pad to remaining width */

    if (flags & UNI_FLAG_PAD) {

        dst = xrealloc(dst, dst_len + width + 1);

        uni_count += width;

        uni_width += width;

        while ((int)--width >= 0) {

            dst[dst_len++] = ' ';

        }

    }

    dst[dst_len] = '\0';

    if (stats) {

        stats->byte_count = dst_len;

        stats->unicode_count = uni_count;

        stats->unicode_width = uni_width;

    }

    return dst;

}

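Two places in the code above need to change: the assignment *d++ = (c >= ' ' && c < 0x7f) ? c : '?'; in the UNI_FLAG_PAD branch, and the test if (c < ' ' || c >= 0x7f) in the else branch. The modified code is as follows: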

static char* FAST_FUNC unicode_conv_to_printable2(uni_stat_t *stats, const char *src, unsigned width, int flags)

{

    char *dst;

    unsigned dst_len;

    unsigned uni_count;

    unsigned uni_width;

    if (unicode_status != UNICODE_ON) {

        char *d;

        if (flags & UNI_FLAG_PAD) {

            d = dst = xmalloc(width + 1);

            while ((int)--width >= 0) {

                unsigned char c = *src;

                if (c == '\0') {

                    do

                        *d++ = ' ';

                    while ((int)--width >= 0);

                    break;

                }

                *d++ = (c >= ' '/* && c < 0x7f */) ? c : '?';

                src++;

            }

            *d = '\0';

        } else {

            d = dst = xstrndup(src, width);

            while (*d) {

                unsigned char c = *d;

                if (c < ' '/* || c >= 0x7f */)

                    *d = '?';

                d++;

            }

        }

        if (stats) {

            stats->byte_count = (d - dst);

            stats->unicode_count = (d - dst);

            stats->unicode_width = (d - dst);

        }

        return dst;

    }

    dst = NULL;

    uni_count = uni_width = 0;

    dst_len = 0;

    while (1) {

        int w;

        wchar_t wc;

#if ENABLE_UNICODE_USING_LOCALE

        {

            mbstate_t mbst = { 0 };

            ssize_t rc = mbsrtowcs(&wc, &src, 1, &mbst);

            /* If invalid sequence is seen: -1 is returned,

             * src points to the invalid sequence, errno = EILSEQ.

             * Else number of wchars (excluding terminating L'\0')

             * written to dest is returned.

             * If len (here: 1) non-L'\0' wchars stored at dest,

             * src points to the next char to be converted.

             * If string is completely converted: src = NULL.

             */

            if (rc == 0) /* end-of-string */

                break;

            if (rc < 0) { /* error */

                src++;

                goto subst;

            }

            if (!iswprint(wc))

                goto subst;

        }

#else

        src = mbstowc_internal(&wc, src);

        /* src is advanced to next mb char

         * wc == ERROR_WCHAR: invalid sequence is seen

         * else: wc is set

         */

        if (wc == ERROR_WCHAR) /* error */

            goto subst;

        if (wc == 0) /* end-of-string */

            break;

#endif

        if (CONFIG_LAST_SUPPORTED_WCHAR && wc > CONFIG_LAST_SUPPORTED_WCHAR)

            goto subst;

        w = wcwidth(wc);

        if ((ENABLE_UNICODE_COMBINING_WCHARS && w < 0) /* non-printable wchar */

         || (!ENABLE_UNICODE_COMBINING_WCHARS && w <= 0)

         || (!ENABLE_UNICODE_WIDE_WCHARS && w > 1)

        ) {

 subst:

            wc = CONFIG_SUBST_WCHAR;

            w = 1;

        }

        width -= w;

        /* Note: if width == 0, we still may add more chars,

         * they may be zero-width or combining ones */

        if ((int)width < 0) {

            /* can't add this wc, string would become longer than width */

            width += w;

            break;

        }

        uni_count++;

        uni_width += w;

        dst = xrealloc(dst, dst_len + MB_CUR_MAX);

#if ENABLE_UNICODE_USING_LOCALE

        {

            mbstate_t mbst = {  };

            dst_len += wcrtomb(&dst[dst_len], wc, &mbst);

        }

#else

        dst_len += wcrtomb_internal(&dst[dst_len], wc);

#endif

    }

    /* Pad to remaining width */

    if (flags & UNI_FLAG_PAD) {

        dst = xrealloc(dst, dst_len + width + 1);

        uni_count += width;

        uni_width += width;

        while ((int)--width >= 0) {

            dst[dst_len++] = ' ';

        }

    }

    dst[dst_len] = '\0';

    if (stats) {

        stats->byte_count = dst_len;

        stats->unicode_count = uni_count;

        stats->unicode_width = uni_width;

    }

    return dst;

}

After these changes, ls can display Chinese even when Unicode support is configured. You can also cd into directories with Chinese names.
