Checking whether a string contains a set of characters
You need to check whether any of a given set of characters occurs in a string.

The simplest solution is clear, fast, and general (it works not only for strings but for any sequence, and not only for sets but for any container that supports membership testing):

    def containsAny(seq, aset):
        """Check whether sequence seq contains ANY of the items in aset."""
        for c in seq:
            if c in aset:
                return True
        return False

You can gain a little speed by moving to a higher-level, more sophisticated approach based on the standard library module itertools, which expresses essentially the same logic in a different way:

    import itertools

    def containsAny(seq, aset):
        for item in itertools.ifilter(aset.__contains__, seq):
            return True
        return False

Most problems related to sets are best solved with the built-in set type introduced in Python 2.4 (in Python 2.3, you can use the equivalent sets.Set type from the standard library). But there are exceptions. Here, for example, a purely set-based approach would be:

    def containsAny(seq, aset):
        return bool(set(aset).intersection(seq))

However, with this approach every item in seq must be examined. The functions given in the solution above, by contrast, short-circuit: they return as soon as the answer is known. Of course, when the result is False, those functions still have to examine all of seq; otherwise we could not be sure that no item of seq is in aset. When the result is True, though, we often learn it quickly, because we only need to find one item that is a member of aset. Whether this matters depends entirely on the data. If seq is short, or if the result is mostly False, there is no substantial difference between the two approaches; the difference can be very important for a long sequence for which the result is typically established quickly as True.
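The effect of short-circuiting can be made visible with a small experiment. The following is only an illustrative sketch, written for Python 3: the two functions are renamed copies of the versions above so they can be compared side by side, and CountingSeq is a hypothetical helper (not part of the recipe) that counts how many items are pulled from it.

```python
def containsAnyLoop(seq, aset):
    """Short-circuiting version: stops at the first item found in aset."""
    for c in seq:
        if c in aset:
            return True
    return False


def containsAnySet(seq, aset):
    """Set-based version: intersection() iterates over all of seq."""
    return bool(set(aset).intersection(seq))


class CountingSeq:
    """Hypothetical helper: an iterable that counts the items it hands out."""

    def __init__(self, data):
        self.data = data
        self.examined = 0

    def __iter__(self):
        for item in self.data:
            self.examined += 1
            yield item


text = "x" + "a" * 10000          # the match is the very first character
loop_seq = CountingSeq(text)
set_seq = CountingSeq(text)

loop_result = containsAnyLoop(loop_seq, "xyz")
set_result = containsAnySet(set_seq, "xyz")

# Both calls agree on the answer; they differ in how much of the
# sequence had to be examined to reach it.
print(loop_result, loop_seq.examined)
print(set_result, set_seq.examined)
```

With the match at the start of a long sequence, the loop version examines a single item, while the set-based version walks the whole sequence before answering.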

The first version of containsAny in the solution has the advantage of being simple and clear: it expresses its core idea directly. The second version may look "clever", but "clever" is not a compliment in the Python world, where simplicity and clarity are core values. The second version is still worth considering, however, because it shows a higher-level approach based on the standard library module itertools, and higher-level approaches are usually preferable to lower-level ones (although the point is moot in this particular case). itertools.ifilter takes a predicate and an iterable, and yields the items of that iterable that satisfy the predicate. Here, aset.__contains__ is used as the predicate; it is the bound method that is called internally when we write an expression of the form "in aset" to test membership. Therefore, as soon as any item of seq belongs to aset, ifilter yields it, and we can return True immediately. If execution reaches the statement after the for loop, return True was never executed, which means that no item of seq belongs to aset, so we return False.
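itertools.ifilter exists only in Python 2. As a hedged sketch for readers on Python 3, where the built-in filter is itself lazy, the same idea can be written as follows. The second function, containsAnyBuiltin, is a name chosen here for illustration; it uses the built-in any, which also stops at the first true item.

```python
def containsAny(seq, aset):
    """Return True as soon as one item of seq is found in aset."""
    # filter() is lazy in Python 3, so the loop body runs at most once.
    for item in filter(aset.__contains__, seq):
        return True
    return False


def containsAnyBuiltin(seq, aset):
    """Same short-circuiting logic, expressed with the built-in any()."""
    return any(c in aset for c in seq)


vowels = set("aeiou")
print(containsAny("rhythm", vowels))         # False
print(containsAny("python", vowels))         # True
print(containsAnyBuiltin([1, 2, 3], {3, 4})) # True
```

Both forms use the membership test as the predicate; which one reads better is largely a matter of taste.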

What is a "predicate"? "Predicate" is a term that often comes up in discussions about programming. It means "a function (or other callable object) that returns True or False". If the result returned by the predicate is true, the predicate is said to be satisfied.

If your application needs a function like containsAny to check whether a string (or other sequence) contains members of a set, you may also need a variant like the following:

    def containsOnly(seq, aset):
        """Check whether sequence seq contains ONLY items in aset."""
        for c in seq:
            if c not in aset:
                return False
        return True

containsOnly has the same form as containsAny, but with the logic reversed. Other apparently similar functions inherently need to examine all items, so short-circuiting does not apply, and they are best handled with the built-in set type (in Python 2.4; in Python 2.3, sets.Set can be used in the same way):

    def containsAll(seq, aset):
        """Check whether sequence seq contains ALL the items in aset."""
        return not set(aset).difference(seq)

If you are not used to the difference method of set (or sets.Set), pay attention to its semantics: for any set a, a.difference(b) returns the set of all elements of a that do not belong to b (just like a-set(b)). For example:

    >>> L1 = [1, 2, 3, 3]
    >>> L2 = [1, 2, 3, 4]
    >>> set(L1).difference(L2)
    set([])
    >>> set(L2).difference(L1)
    set([4])

I hope the above example helps explain the following results:

    >>> containsAll(L1, L2)
    False
    >>> containsAll(L2, L1)
    True

(In other words, don't confuse difference with another method of set, symmetric_difference, which returns the set of all elements that belong to A but not to B, or to B but not to A.)

For symmetric_difference, see the following example:

    >>> L1 = [1, 2, 3, 5]
    >>> L2 = [1, 3, 4, 8]
    >>> set(L1).symmetric_difference(L2)
    set([2, 4, 5, 8])
    >>> set(L2).symmetric_difference(L1)
    set([2, 4, 5, 8])

If the seq and aset you want to handle are both (plain, not Unicode) strings, you may not need the full generality of the functions in this recipe. Consider the more specialized approach described in Recipe 1.10, based on the translate method of strings and the string.maketrans function from the standard library. For example:

    import string
    notrans = string.maketrans('', '')    # identity "translation"

    def containsAny(astr, strset):
        return len(strset) != len(strset.translate(notrans, astr))

    def containsAll(astr, strset):
        return not strset.translate(notrans, astr)

This is a slightly tricky approach. It relies on the fact that strset.translate(notrans, astr) is the subsequence of strset made up of the characters that do not belong to astr. If this subsequence has the same length as strset, then strset.translate has not deleted any character of strset, which means that no character of strset belongs to astr. Conversely, if the subsequence is empty, then strset.translate has deleted every character of strset, which means that all characters of strset are also in astr. The translate method comes up naturally whenever you want to treat a string as a set of characters, because it is efficient, easy to use, and flexible (see Recipe 1.10 for details).
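The code above targets Python 2, where plain strings are byte strings. As a hedged sketch for Python 3, assuming both arguments are bytes objects, the same idea can be expressed with bytes.translate, which accepts None as the table and a second argument listing the bytes to delete.

```python
def containsAny(astr, strset):
    """True if at least one byte of strset occurs in astr."""
    # translate(None, astr) keeps strset's bytes, minus those found in astr.
    return len(strset) != len(strset.translate(None, astr))


def containsAll(astr, strset):
    """True if every byte of strset occurs in astr."""
    # An empty result means every byte of strset was deleted.
    return not strset.translate(None, astr)


print(b"xyz".translate(None, b"hex"))   # b'yz'
print(containsAny(b"hex", b"xyz"))      # True
print(containsAll(b"hex", b"xyz"))      # False
print(containsAll(b"hex", b"eh"))       # True
```

The first print shows the intermediate value the two functions rely on: the bytes of strset that survive after everything found in astr has been removed.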

The two groups of solutions in this recipe differ greatly in generality. The first group is very general: it is not limited to string processing and makes few demands on the objects it operates on. The second group, based on the translate method, works only if astr and strset are both strings, or if they mimic the functionality of plain strings very closely. Unicode strings are not suitable for the translate-based approach, because the translate method of Unicode strings has a different signature from that of plain strings: it takes a single argument (a dict that maps code numbers to Unicode strings or None), whereas the translate method of plain strings takes two arguments (both strings).
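The single-argument signature can still support the same trick. The following is a minimal sketch, assuming Python 3 (where every str is a Unicode string): the table is a dict that maps each code point of astr to None, which str.translate interprets as "delete this character". The table is built with dict.fromkeys, a choice made here for brevity rather than something the recipe prescribes.

```python
def containsAny(astr, strset):
    """True if at least one character of strset occurs in astr."""
    # dict.fromkeys(...) maps every code point of astr to None ("delete").
    table = dict.fromkeys(map(ord, astr))
    return len(strset) != len(strset.translate(table))


def containsAll(astr, strset):
    """True if every character of strset occurs in astr."""
    table = dict.fromkeys(map(ord, astr))
    return not strset.translate(table)


print(containsAny("αβγ", "γδ"))   # True
print(containsAll("αβγ", "γδ"))   # False
print(containsAll("αβγ", "βα"))   # True
```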

See also: Recipe 1.10; the Library Reference and Python in a Nutshell documentation on the translate method of strings and Unicode objects and on the maketrans function of the string module; the documentation on the built-in set type (Python 2.4 and later only), on the sets and itertools modules, and on the special method __contains__.