Research Watch

Using sibling data still finds no treatment effect of family group conferencing on child welfare outcomes

Reviewed by: Sydney Duder

Berzin, S.C. (2006). Using sibling data to understand the impact of family group decision-making on child welfare outcomes. Children and Youth Services Review, 28(12), 1449-1458.


Studies of the impact of family group decision-making (FGDM) on child welfare outcomes have suffered from small sample sizes and lack of adequate comparison groups. Since FGDM is administered at the family level, the use of sibling data provides a possible way of compensating for small sample sizes and low enrollment rates. This study used sibling data from California's Title IV-E Waiver Demonstration Project Evaluation to compare child welfare outcomes in two counties for families receiving FGDM (Fresno County, n = 110; Riverside County, n = 87) with those for families receiving traditional child welfare services (Fresno County, n = 74; Riverside County, n = 52). Group differences in child maltreatment, placement stability, and permanence were modeled using linear and logistic regression, with generalized estimating equations (GEE) used to account for the clustering of siblings within families. Outcomes from both counties showed no evidence of group differences in child welfare indicators; these neutral outcomes raise questions about the efficacy of FGDM. This study confirmed previous findings on the impact of FGDM that were based on smaller sample sizes, but Berzin argued that it demonstrated a methodological improvement over individual or family-case analyses.

Methodological Notes

This seems to be a genuine effort to implement a rigorous experimental design. Families were randomized to treatment conditions, and sophisticated multivariate statistical procedures were used to deal with “the clustering effects for siblings from the same family group,” and to control for the child’s time in the study and past occurrences of maltreatment. Three potentially important outcomes (child safety, placement stability and permanency-related outcomes) were measured. No improvement was found; in fact, the estimated impact of FGDM was slightly negative. Berzin suggested that a negative effect on maltreatment rates may have been the result of hyper-vigilance by the social worker or higher rates of reporting by other family members.
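For readers less familiar with GEE, the clustering adjustment can be made concrete with the generic textbook formulation. This is not notation taken from Berzin's paper: the symbols below, and the exchangeable working correlation in particular, are assumptions for illustration, since this review does not state which correlation structure was used.

```latex
% Generic GEE for K families; family i contributes n_i siblings with outcome vector y_i.
\[
\sum_{i=1}^{K} D_i^{\top} V_i^{-1}\bigl(y_i - \mu_i(\beta)\bigr) = 0,
\qquad
D_i = \frac{\partial \mu_i}{\partial \beta},
\qquad
V_i = \phi\, A_i^{1/2} R(\alpha)\, A_i^{1/2}
\]
% Exchangeable working correlation (assumed here): every pair of siblings
% in the same family shares the same correlation alpha.
\[
R(\alpha)_{jk} =
\begin{cases}
1, & j = k \\
\alpha, & j \neq k
\end{cases}
\]
```

Here \mu_i is the vector of expected outcomes for the siblings in family i (tied to the covariates by an identity link for continuous outcomes or a logit link for binary ones), A_i is the diagonal matrix of variance functions, and \phi is a scale parameter. Standard errors are usually taken from the robust "sandwich" estimator, which stays valid even when R(\alpha) is misspecified, provided the model for the mean is correct.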

There were, however, serious problems with actual program delivery: (1) practice methods, child age ranges and FGDM models differed somewhat in the two counties, so there was no consistent intervention with a defined client group; (2) no effort to monitor actual program implementation is reported; (3) because some workers were using both methods with different families, some elements of FGDM practice may have leaked into the comparison group (treatment contamination). Altogether, this suggests a possible “Type III error,” with the planned intervention not fully or consistently implemented.

Berzin argued that the use of sibling data with FGDM is a methodological improvement, because it increases the sample size and compensates for low enrollment. Using children instead of families as the unit of analysis certainly increases N; but the intervention involved family group decision-making, and the actual number of families was unchanged. Because siblings share a family and a conference, their outcomes are correlated, so each additional sibling contributes less than one independent observation. The issue here seems to be the change in the nature of the family data (inclusion of sibling information) and the corresponding change in statistical procedures, rather than a simple increase in “sample size.”
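The point about N can be illustrated with the standard Kish design effect, which converts a clustered sample into an approximate "effective" number of independent observations. The sketch below is a minimal illustration, not Berzin's analysis: the function names are hypothetical, the formula assumes roughly equal family sizes and a non-negative intraclass correlation, and the figures (200 children in 100 families) are invented for the example rather than drawn from the study.

```python
def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Kish design effect for (roughly) equal-sized clusters: 1 + (m - 1) * rho."""
    if mean_cluster_size < 1:
        raise ValueError("mean cluster size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("this sketch assumes an intraclass correlation in [0, 1]")
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n_children: int, n_families: int, icc: float) -> float:
    """Approximate number of independent observations in a sibling sample."""
    mean_siblings = n_children / n_families
    return n_children / design_effect(mean_siblings, icc)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show how the correlation matters.
    for icc in (0.0, 0.5, 1.0):
        n_eff = effective_sample_size(n_children=200, n_families=100, icc=icc)
        print(f"ICC = {icc:.1f}: effective N = {n_eff:.1f}")
```

When siblings' outcomes are uncorrelated, the effective N equals the number of children; when they are perfectly correlated, it falls back to the number of families. Real sibling data sit somewhere in between, which is why the gain from adding siblings depends on the within-family correlation rather than on the raw count of children.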

In summary, this project combined a rigorous experimental design with sophisticated data analysis, the latter being the one thing that Berzin could actually control. However, the unsatisfactory program implementation, combined with the absence of any measurable impact, raises serious questions about how these findings should be applied: the null result may reflect an intervention that was not delivered as planned rather than one that does not work.