File Manager - Edit - /home/digitalm/venv/lib/python3.7/site-packages/pandas/core/groupby/__pycache__/groupby.cpython-37.pyc
B �5�g<� � @ s d Z ddlmZ ddlmZ ddlZddlmZmZ ddl Z ddl mZ ddlZddl mZmZmZmZmZmZmZmZmZmZmZ ddlZddlZddlmZ dd lmZm Z ddl!m" m#Z$ dd l%m&Z&m'Z'm(Z(m)Z)m*Z*m+Z+m,Z,m-Z- ddl.m/Z0 ddl1m2Z2 dd l3m4Z4m5Z5m6Z6m7Z7 ddl8m9Z9m:Z:m;Z;m<Z<m=Z=m>Z>m?Z?m@Z@ ddlAmBZBmCZC ddlDmEZE ddlFmG mHZH ddlImJZJmKZKmLZLmMZM ddlNmOZOmPZPmQZQ ddlRmG mSZT ddlUmVZV ddlWmXZX ddlYmZZZm[Z[m\Z\ ddl]m^Z^m_Z_m`Z` ddlambZb ddlcmdZd ddlemfZf ddlgmhZhmiZi e�r4ddl mjZj dZkdddd �Zld!Zmd"Znd#Zod$Zpe-G d%d&� d&eP��Zqed'd(d)�d*d+��Zreeee eegef eeegef eeef f ZsG d,d-� d-ePeQe( �Zted.eXd/�ZuG d0d'� d'ete( �Zve7ev�d;d3d4d5d6d7d7d7d7d7d7d7d'd8�d9d:��ZwdS )<a Provide the groupby split-apply-combine paradigm. Define the GroupBy class providing the base-class of operations. The SeriesGroupBy and DataFrameGroupBy sub-class (defined in pandas.core.groupby.generic) expose these user-facing objects to provide specific functionality. � )�annotations)�contextmanagerN)�partial�wraps)�dedent)� TYPE_CHECKING�Callable�Hashable�Iterable�Iterator�List�Mapping�Sequence�TypeVar�Union�cast)�option_context)� Timestamp�lib)� ArrayLike�F� FrameOrSeries�FrameOrSeriesUnion� IndexLabel�Scalar�T�final)�function)�AbstractMethodError)�Appender�Substitution�cache_readonly�doc)� is_bool_dtype�is_datetime64_dtype�is_float_dtype�is_integer_dtype�is_numeric_dtype�is_object_dtype� is_scalar�is_timedelta64_dtype)�isna�notna)�nanops)�BaseMaskedArray�BooleanArray�Categorical�ExtensionArray)� DataError�PandasObject�SelectionMixin)� DataFrame)�NDFrame)�base�numba_�ops)�CategoricalIndex�Index� MultiIndex)�ensure_block_shape)�Series)�get_group_index_sorter)�NUMBA_FUNC_CACHE�maybe_use_numba)�Literalz� See Also -------- Series.%(name)s : Apply a function %(name)s to a Series. DataFrame.%(name)s : Apply a function %(name)s to each row or column of a DataFrame. a� Apply function ``func`` group-wise and combine the results together. The function passed to ``apply`` must take a {input} as its first argument and return a DataFrame, Series or scalar. 
``apply`` will then take care of combining the results back together into a single dataframe or series. ``apply`` is therefore a highly flexible grouping method. While ``apply`` is a very flexible method, its downside is that using it can be quite a bit slower than using more specific methods like ``agg`` or ``transform``. Pandas offers a wide range of method that will be much faster than using ``apply`` for their specific purposes, so try to use them before reaching for ``apply``. Parameters ---------- func : callable A callable that takes a {input} as its first argument, and returns a dataframe, a series or a scalar. In addition the callable may take positional and keyword arguments. args, kwargs : tuple and dict Optional positional and keyword arguments to pass to ``func``. Returns ------- applied : Series or DataFrame See Also -------- pipe : Apply function to the full GroupBy object instead of to each group. aggregate : Apply aggregate function to the GroupBy object. transform : Apply function column-by-column to the GroupBy object. Series.apply : Apply a function to a Series. DataFrame.apply : Apply a function to each row or column of a DataFrame. Notes ----- In the current implementation ``apply`` calls ``func`` twice on the first group to decide whether it can take a fast or slow code path. This can lead to unexpected behavior if ``func`` has side-effects, as they will take effect twice for the first group. .. versionchanged:: 1.3.0 The resulting dtype will reflect the return value of the passed ``func``, see the examples below. Examples -------- {examples} aY >>> df = pd.DataFrame({'A': 'a a b'.split(), ... 'B': [1,2,3], ... 'C': [4,6,5]}) >>> g = df.groupby('A') Notice that ``g`` has two groups, ``a`` and ``b``. Calling `apply` in various ways, we can get different grouping results: Example 1: below the function passed to `apply` takes a DataFrame as its argument and returns a DataFrame. 
`apply` combines the result for each group together into a new DataFrame: >>> g[['B', 'C']].apply(lambda x: x / x.sum()) B C 0 0.333333 0.4 1 0.666667 0.6 2 1.000000 1.0 Example 2: The function passed to `apply` takes a DataFrame as its argument and returns a Series. `apply` combines the result for each group together into a new DataFrame. .. versionchanged:: 1.3.0 The resulting dtype will reflect the return value of the passed ``func``. >>> g[['B', 'C']].apply(lambda x: x.astype(float).max() - x.min()) B C A a 1.0 2.0 b 0.0 0.0 Example 3: The function passed to `apply` takes a DataFrame as its argument and returns a scalar. `apply` combines the result for each group together into a Series, including setting the index as appropriate: >>> g.apply(lambda x: x.C.max() - x.B.min()) A a 5 b 2 dtype: int64a� >>> s = pd.Series([0, 1, 2], index='a a b'.split()) >>> g = s.groupby(s.index) From ``s`` above we can see that ``g`` has two groups, ``a`` and ``b``. Calling `apply` in various ways, we can get different grouping results: Example 1: The function passed to `apply` takes a Series as its argument and returns a Series. `apply` combines the result for each group together into a new Series. .. versionchanged:: 1.3.0 The resulting dtype will reflect the return value of the passed ``func``. >>> g.apply(lambda x: x*2 if x.name == 'a' else x/2) a 0.0 a 2.0 b 1.0 dtype: float64 Example 2: The function passed to `apply` takes a Series as its argument and returns a scalar. `apply` combines the result for each group together into a Series, including setting the index as appropriate: >>> g.apply(lambda x: x.max() - x.min()) a 1 b 0 dtype: int64)�template�dataframe_examplesZseries_examplesa� Compute {fname} of group values. Parameters ---------- numeric_only : bool, default {no} Include only float, int, boolean columns. If None, will attempt to use everything, then use only numeric data. min_count : int, default {mc} The required number of valid values to perform the operation. 
If fewer than ``min_count`` non-NA values are present the result will be NA. Returns ------- Series or DataFrame Computed {fname} of values within each group. aa Apply a function `func` with arguments to this %(klass)s object and return the function's result. Use `.pipe` when you want to improve readability by chaining together functions that expect Series, DataFrames, GroupBy or Resampler objects. Instead of writing >>> h(g(f(df.groupby('group')), arg1=a), arg2=b, arg3=c) # doctest: +SKIP You can write >>> (df.groupby('group') ... .pipe(f) ... .pipe(g, arg1=a) ... .pipe(h, arg2=b, arg3=c)) # doctest: +SKIP which is much more readable. Parameters ---------- func : callable or tuple of (callable, str) Function to apply to this %(klass)s object or, alternatively, a `(callable, data_keyword)` tuple where `data_keyword` is a string indicating the keyword of `callable` that expects the %(klass)s object. args : iterable, optional Positional arguments passed into `func`. kwargs : dict, optional A dictionary of keyword arguments passed into `func`. Returns ------- object : the return type of `func`. See Also -------- Series.pipe : Apply a function with arguments to a series. DataFrame.pipe: Apply a function with arguments to a dataframe. apply : Apply function to each group instead of to the full %(klass)s object. Notes ----- See more `here <https://pandas.pydata.org/pandas-docs/stable/user_guide/groupby.html#piping-function-calls>`_ Examples -------- %(examples)s a� Call function producing a like-indexed %(klass)s on each group and return a %(klass)s having the same indexes as the original object filled with the transformed values Parameters ---------- f : function Function to apply to each group. Can also accept a Numba JIT function with ``engine='numba'`` specified. If the ``'numba'`` engine is chosen, the function must be a user defined function with ``values`` and ``index`` as the first and second arguments respectively in the function signature. 
Each group's index will be passed to the user defined function and optionally available for use. .. versionchanged:: 1.1.0 *args Positional arguments to pass to func. engine : str, default None * ``'cython'`` : Runs the function through C-extensions from cython. * ``'numba'`` : Runs the function through JIT compiled code from numba. * ``None`` : Defaults to ``'cython'`` or the global setting ``compute.use_numba`` .. versionadded:: 1.1.0 engine_kwargs : dict, default None * For ``'cython'`` engine, there are no accepted ``engine_kwargs`` * For ``'numba'`` engine, the engine can accept ``nopython``, ``nogil`` and ``parallel`` dictionary keys. The values must either be ``True`` or ``False``. The default ``engine_kwargs`` for the ``'numba'`` engine is ``{'nopython': True, 'nogil': False, 'parallel': False}`` and will be applied to the function .. versionadded:: 1.1.0 **kwargs Keyword arguments to be passed into func. Returns ------- %(klass)s See Also -------- %(klass)s.groupby.apply : Apply function ``func`` group-wise and combine the results together. %(klass)s.groupby.aggregate : Aggregate using one or more operations over the specified axis. %(klass)s.transform : Call ``func`` on self producing a %(klass)s with transformed values. Notes ----- Each group is endowed the attribute 'name' in case you need to know which group you are working on. The current implementation imposes three requirements on f: * f must return a value that either has the same shape as the input subframe or can be broadcast to the shape of the input subframe. For example, if `f` returns a scalar it will be broadcast to have the same shape as the input subframe. * if this is a DataFrame, f must support application column-by-column in the subframe. If f also supports application to the entire subframe, then a fast path is used starting from the second chunk. * f must not mutate groups. Mutation is not supported and may produce unexpected results. See :ref:`gotchas.udf-mutation` for more details. 
When using ``engine='numba'``, there will be no "fall back" behavior internally. The group data and group index will be passed as numpy arrays to the JITed user defined function, and no alternative execution attempts will be tried. .. versionchanged:: 1.3.0 The resulting dtype will reflect the return value of the passed ``func``, see the examples below. Examples -------- >>> df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar', ... 'foo', 'bar'], ... 'B' : ['one', 'one', 'two', 'three', ... 'two', 'two'], ... 'C' : [1, 5, 5, 2, 5, 5], ... 'D' : [2.0, 5., 8., 1., 2., 9.]}) >>> grouped = df.groupby('A') >>> grouped.transform(lambda x: (x - x.mean()) / x.std()) C D 0 -1.154701 -0.577350 1 0.577350 0.000000 2 0.577350 1.154701 3 -1.154701 -1.000000 4 0.577350 -0.577350 5 0.577350 1.000000 Broadcast result of the transformation >>> grouped.transform(lambda x: x.max() - x.min()) C D 0 4 6.0 1 3 8.0 2 4 6.0 3 3 8.0 4 4 6.0 5 3 8.0 .. versionchanged:: 1.3.0 The resulting dtype will reflect the return value of the passed ``func``, for example: >>> grouped[['C', 'D']].transform(lambda x: x.astype(int).max()) C D 0 5 8 1 5 9 2 5 8 3 5 9 4 5 8 5 5 9 a� Aggregate using one or more operations over the specified axis. Parameters ---------- func : function, str, list or dict Function to use for aggregating the data. If a function, must either work when passed a {klass} or when passed to {klass}.apply. Accepted combinations are: - function - string function name - list of functions and/or function names, e.g. ``[np.sum, 'mean']`` - dict of axis labels -> functions, function names or list of such. Can also accept a Numba JIT function with ``engine='numba'`` specified. Only passing a single function is supported with this engine. If the ``'numba'`` engine is chosen, the function must be a user defined function with ``values`` and ``index`` as the first and second arguments respectively in the function signature. 
Each group's index will be passed to the user defined function and optionally available for use. .. versionchanged:: 1.1.0 *args Positional arguments to pass to func. engine : str, default None * ``'cython'`` : Runs the function through C-extensions from cython. * ``'numba'`` : Runs the function through JIT compiled code from numba. * ``None`` : Defaults to ``'cython'`` or globally setting ``compute.use_numba`` .. versionadded:: 1.1.0 engine_kwargs : dict, default None * For ``'cython'`` engine, there are no accepted ``engine_kwargs`` * For ``'numba'`` engine, the engine can accept ``nopython``, ``nogil`` and ``parallel`` dictionary keys. The values must either be ``True`` or ``False``. The default ``engine_kwargs`` for the ``'numba'`` engine is ``{{'nopython': True, 'nogil': False, 'parallel': False}}`` and will be applied to the function .. versionadded:: 1.1.0 **kwargs Keyword arguments to be passed into func. Returns ------- {klass} See Also -------- {klass}.groupby.apply : Apply function func group-wise and combine the results together. {klass}.groupby.transform : Aggregate using one or more operations over the specified axis. {klass}.aggregate : Transforms the Series on each group based on the given function. Notes ----- When using ``engine='numba'``, there will be no "fall back" behavior internally. The group data and group index will be passed as numpy arrays to the JITed user defined function, and no alternative execution attempts will be tried. Functions that mutate the passed object can produce unexpected behavior or errors and are not supported. See :ref:`gotchas.udf-mutation` for more details. .. versionchanged:: 1.3.0 The resulting dtype will reflect the return value of the passed ``func``, see the examples below. {examples}c @ s4 e Zd ZdZdd�dd�Zdd� Zdd �d d�ZdS ) �GroupByPlotzE Class implementing the .plot attribute for groupby objects. 
�GroupBy)�groupbyc C s || _ d S )N)�_groupby)�selfrG � rJ �Q/home/digitalm-up/venv/lib/python3.7/site-packages/pandas/core/groupby/groupby.py�__init__ s zGroupByPlot.__init__c s � �fdd�}d|_ | j�|�S )Nc s | j � ��S )N)�plot)rI )�args�kwargsrJ rK �f s zGroupByPlot.__call__.<locals>.frM )�__name__rH �apply)rI rN rO rP rJ )rN rO rK �__call__ s zGroupByPlot.__call__�str)�namec s � �fdd�}|S )Nc s � ��fdd�}�j �|�S )Nc s t | j��� ��S )N)�getattrrM )rI )rN rO rU rJ rK rP s z0GroupByPlot.__getattr__.<locals>.attr.<locals>.f)rH rR )rN rO rP )rU rI )rN rO rK �attr s z%GroupByPlot.__getattr__.<locals>.attrrJ )rI rU rW rJ )rU rI rK �__getattr__ s zGroupByPlot.__getattr__N)rQ � __module__�__qualname__�__doc__rL rS rX rJ rJ rJ rK rE s rE rF zIterator[GroupBy])rG �returnc c s"