题目如下:

Given a list of directory info including directory path, and all the files with contents in this directory, you need to find out all the groups of duplicate files in the file system in terms of their paths.

A group of duplicate files consists of at least two files that have exactly the same content.

A single directory info string in the input list has the following format:

"root/d1/d2/.../dm f1.txt(f1_content) f2.txt(f2_content) ... fn.txt(fn_content)"

It means there are n files (f1.txtf2.txt ... fn.txt with content f1_contentf2_content ... fn_content, respectively) in directory root/d1/d2/.../dm. Note that n >= 1 and m >= 0. If m = 0, it means the directory is just the root directory.

The output is a list of group of duplicate file paths. For each group, it contains all the file paths of the files that have the same content. A file path is a string that has the following format:

"directory_path/file_name.txt"

Example 1:

Input:
["root/a 1.txt(abcd) 2.txt(efgh)", "root/c 3.txt(abcd)", "root/c/d 4.txt(efgh)", "root 4.txt(efgh)"]
Output:
[["root/a/2.txt","root/c/d/4.txt","root/4.txt"],["root/a/1.txt","root/c/3.txt"]]

Note:

  1. No order is required for the final output.
  2. You may assume the directory name, file name and file content only has letters and digits, and the length of file content is in the range of [1,50].
  3. The number of files given is in the range of [1,20000].
  4. You may assume no files or directories share the same name in the same directory.
  5. You may assume each given directory info represents a unique directory. Directory path and file info are separated by a single blank space.

解题思路:我的方法非常简单,解析输入的字符串,把文件内容作为key,文件路径作为value存入字典中,当然value是用list保存的。最后遍历字典,value超过两个元素即为符合条件的结果。

代码如下:

class Solution(object):
def findDuplicate(self, paths):
"""
:type paths: List[str]
:rtype: List[List[str]]
"""
dic = {}
for p in paths:
item = p.split(' ')
for i in range(1,len(item)):
file_name = item[i].split('(')[0]
file_content = item[i].split('(')[1][:-1]
dic[file_content] = dic.setdefault(file_content,[]) + [item[0] + '/' + file_name]
res = []
for v in dic.itervalues():
if len(v) >= 2:
res.append(v)
return res

【leetcode】609. Find Duplicate File in System的更多相关文章

  1. 【LeetCode】609. Find Duplicate File in System 解题报告(Python & C++)

    作者: 负雪明烛 id: fuxuemingzhu 个人博客: http://fuxuemingzhu.cn/ 目录 题目描述 题目大意 解题方法 日期 题目地址:https://leetcode.c ...

  2. 【LeetCode】652. Find Duplicate Subtrees 解题报告(Python)

    [LeetCode]652. Find Duplicate Subtrees 解题报告(Python) 标签(空格分隔): LeetCode 作者: 负雪明烛 id: fuxuemingzhu 个人博 ...

  3. LC 609. Find Duplicate File in System

    Given a list of directory info including directory path, and all the files with contents in this dir ...

  4. 609. Find Duplicate File in System

    Given a list of directory info including directory path, and all the files with contents in this dir ...

  5. 【LeetCode】316. Remove Duplicate Letters 解题报告(Python & C++)

    作者: 负雪明烛 id: fuxuemingzhu 个人博客: http://fuxuemingzhu.cn/ 目录 题目描述 题目大意 解题方法 日期 题目地址:https://leetcode.c ...

  6. 【LeetCode】388. Longest Absolute File Path 解题报告(Python)

    作者: 负雪明烛 id: fuxuemingzhu 个人博客: http://fuxuemingzhu.cn/ 目录 题目描述: 题目大意 解题方法 日期 题目地址:https://leetcode. ...

  7. 【LeetCode】217. Contains Duplicate (2 solutions)

    Contains Duplicate Given an array of integers, find if the array contains any duplicates. Your funct ...

  8. 【leetcode】316. Remove Duplicate Letters

    题目如下: Given a string which contains only lowercase letters, remove duplicate letters so that every l ...

  9. 【leetcode】388. Longest Absolute File Path

    题目如下: Suppose we abstract our file system by a string in the following manner: The string "dir\ ...

随机推荐

  1. Java机试题目

    1.生成一个随机四位数,每位数字不重复. package com.cloud.stagging.lhcloudzuul; import java.util.Random; /** * 1.生成一个随机 ...

  2. 麦子lavarel---16、日志

    麦子lavarel---16.日志 一.总结 一句话总结: 一定要养成打印日志查看日志的好习惯,非常节约时间和便于查错 1.console.Log(“获取类别数据:"):console.Lo ...

  3. 这才是Tomcat内存配置的正确姿势

    1.背景 虽然阅读了各大牛的博客或文章,但并没有找到特别全面的关于JVM内存分配方法的文章,很多都是复制黏贴 为了严谨,本文特别备注只介绍基于HotSpot VM虚拟机,并且基于JDK1.7的内存分配 ...

  4. rap安装mysql

    1.yum仓库下载MySQL: yum localinstall https://repo.mysql.com//mysql80-community-release-el7-1.noarch.rpm ...

  5. JS正则表达式校验金额

    //任意正整数,正小数(小数位不超过2位) var isNum=/^(([1-9][0-9]*)|(([0]\.\d{1,2}|[1-9][0-9]*\.\d{1,2})))$/; var num = ...

  6. 排序,其他的运用 os fork

    while True: str_num = input("Enter number:") flag = True dotCount = 0 if str_num[0] == '-' ...

  7. HDFS-Suffle

    一.Shuffle机制 1.官网图 2.MR确保每个Reducer的输入都是按照key排序的.系统执行排序的过程(即将Mapper输出作为输入传给Reducer)成为Shuffle 二.Partiti ...

  8. 转 router-view 的理解

    主要是构建 SPA (单页应用) 时,方便渲染你指定路由对应的组件.你可以 router-view 当做是一个容器,它渲染的组件是你使用 vue-router 指定的.比如: 视图层: <div ...

  9. 洛谷 P1546 最短网络 Agri-Net(最小生成树)

    题目链接 https://www.luogu.org/problemnew/show/P1546 说过了不复制内容了 显然是个最小生成树. 解题思路 prim算法 Kruskal算法 prim算法很直 ...

  10. G a+b+c+d=?

    G a+b+c+d=? 链接:https://ac.nowcoder.com/acm/contest/338/G来源:牛客网 题目描述 This is a very simple problem! Y ...